A road story
I'm about to say something meant for the bravest only.
So, since you are one of them, read on - if you dare :)
Let me begin with an example.
You (your mom, your dad, the schoolbus driver) are driving, and all of a sudden a child crosses your path on his bike...
In the normal course of events, you brake instantly, with all your strength. A wonderful little device, named ABS, detects your pressure on the brake pedal, translates it into an actual command to the wheels, then detects that the tyres are sliding because of the excessive braking force, then releases the brake a bit, allowing the tyres to recover their grip on the ground, then tries again to close the brake a bit more. And so on.
The whole sequence has taken less than a second. The car has stopped, and the impact is avoided.
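The loop just described can be sketched in a few lines of toy Python. Everything here - the function name, the slip model, the thresholds - is my own invention for illustration; real automotive code looks nothing like this:

```python
# A toy sketch of the ABS loop described above - NOT real automotive code.
# Slip ratio: 0.0 = the wheel rolls freely, 1.0 = the wheel is fully locked.

def abs_step(pedal_force, slip, brake_force, slip_limit=0.2, step=0.1):
    """One loop iteration: release the brake while the tyre is sliding,
    otherwise close it a bit more, up to the driver's pedal force."""
    if slip > slip_limit:
        return max(0.0, brake_force - step)       # sliding: release a bit
    return min(pedal_force, brake_force + step)   # gripping: close a bit more

# Crude simulation: in this toy model, slip simply grows with braking force.
brake = 0.0
for _ in range(30):
    slip = 0.6 * brake
    brake = abs_step(pedal_force=1.0, slip=slip, brake_force=brake)
print(round(brake, 2))  # hovers just around the grip limit in this toy model
```

Notice how the braking force never settles: it oscillates around the grip limit, releasing and reapplying - exactly the pumping you would feel in the pedal.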
This, naturally, if the ABS was operating.
And, what if it failed?
Things would have gone differently, in that unfortunate case. But chances are very high that neither you nor your immediate descendants will ever witness such an episode. The reason: your car's ABS is designed to be fault-tolerant, up to a certain number of failures. And when that number is exceeded (maybe two, or three), it is guaranteed to "fail to safe", blocking your car in place when you try to start it, the only possible fix being to replace the ABS.
Unlike your phone, or computer, the ABS is designed according to the principles of functional safety, and specific efforts are made to ensure its behavior remains predictable even after some faults.
We are used to thinking of ICT as something dealing with information we humans use directly, like this article, or maybe a crypto-currency transfer.
But the largest amount of ICT activity occurs without us even realizing it exists. Information is massively collected by sensors (which internally may contain a tiny computer - like the popular three-wire "linear thermistors", which in reality are normal thermistors with a tiny microcontroller built in, which reads the logarithmic signal of the real sensor and converts it, digitally, to a nice straight temperature-voltage response), processed, and then used to trigger actuators doing something useful. Like braking, as you have seen. Or pumping some liquid detergent into your container while measuring it and printing a price tag, or commanding a door to open, or...
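To get a feel for what that tiny built-in microcontroller does, here is a Python sketch of the classic "beta equation" inversion such a linearizer might perform. The parameter values (B = 3950 K, R0 = 10 kΩ at 25 °C) are typical for hobby NTC thermistors, but they are my assumption, not taken from any specific part:

```python
import math

B = 3950.0     # assumed beta coefficient of the NTC, in kelvin
R0 = 10_000.0  # assumed nominal resistance at 25 degC, in ohm
T0 = 298.15    # 25 degC expressed in kelvin

def ntc_to_celsius(resistance_ohm):
    """Invert the (roughly exponential) NTC curve via the beta equation,
    the way the built-in microcontroller would, so the result can then be
    output as a nice straight temperature-voltage signal."""
    inv_t = 1.0 / T0 + math.log(resistance_ohm / R0) / B
    return 1.0 / inv_t - 273.15

print(round(ntc_to_celsius(10_000.0), 1))  # at the nominal resistance: 25.0
```

The raw sensor response is anything but linear; the few lines of arithmetic above are what lets the outside world pretend it is.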
We may not realize it, but systems like these are already around us by the tens, maybe hundreds, downtown. They serve their very specific purposes without our intervention, unobtrusively, their measure of success being how unnoticed they remain.
The (dire) concept of "risk" - in the technical sense
Some of them, in case of failure (and faults just happen, thanks to the combined hard facts that computers are in the end minerals, and all minerals on this planet are subject to weathering - have you noticed how seldom sharp edges occur on natural objects?) may harm people, or other beings, or property, or anything deemed (what a horrible word...) "valuable".
In these cases, a very first thing to do is evaluating the risk associated with failure. We all have our own intuitive understanding of the word "risk", as that damn thing we in most cases try to avoid. In the context of safety engineering, the word "risk" has a precise definition, as suggested by a formula:
Risk = Magnitude of the undesired event × Probability of occurrence
The "magnitude" must be expressed one way or the other, with stakeholders agreeing on how to measure it depending on context: in casualties, when human life is directly at harm. Or in dollars lost, in case of a financial havoc. Other definitions exist, too.
The "probability of occurrence", on the other hand, is an estimate. The best we can devise (and there are books upon books explaining how to obtain such estimates, under the umbrella term of "reliability engineering").
So, a "high" risk is associated with some catastrophic event (say, a person losing her hand in a mill, or a worker killed), or a high probability of occurrence, or maybe both.
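To see how the formula behaves, here is a toy computation. The figures are entirely invented for illustration:

```python
def risk(magnitude, probability_per_year):
    """Risk = magnitude of the undesired event x probability of occurrence."""
    return magnitude * probability_per_year

# Two invented scenarios, magnitude expressed (say) in dollars of damage:
frequent_small = risk(magnitude=1_000.0, probability_per_year=0.1)    # a spilled drum
rare_large = risk(magnitude=1_000_000.0, probability_per_year=0.0001) # a major fire

print(frequent_small, rare_large)  # comparable numbers, very different events
```

The two numbers come out essentially the same, which is exactly the uncomfortable point: the formula alone cannot tell us whether a rare catastrophe is "acceptable".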
Things may become really difficult when we face a, sorry for that, catastrophic catastrophe like the Three Mile Island incident, whose probability of occurrence is very small. Is it a "high" risk? Or a low and acceptable one, provided the probability is small enough?
Really dire decisions. And all those politicians whom we pay so much are there precisely to reach consensus on sensible answers to this and other similar questions.
But let us return with our feet steadily planted on the ground. Let's assume someone, based on a rigorous risk evaluation, allocates to our application a "safety integrity level". In my times, and using my beloved and now superseded DIN norms, it was a magic number ranging from 0 to 7, that is, from "don't care" to "absolute, Godzilla-sized danger".
The higher the safety integrity level, the more aggressive, deep, costly and thorough the measures you have to take to ensure your product does no harm - or, more exactly, that the harm it is still able to cause is "acceptably small and/or infrequent".
Measures, oh, of many, many kinds.
Some are organisational: choosing the right programming language, a processor with a simple-to-analyse architecture, the right team; implementing the necessary training; ensuring the people involved in construction share all the information needed; and so on.
Some - in fact, many - others are, instead, technical: which self-test protocols to use; what kind of redundancy in input channels and processing units; what to do when faults occur (like "forcing the system to a safe state"); how to prevent faults - and software bugs; and so on and so on (my beloved VDE 0801 manual was many hundreds of pages long, all filled with tiny technical requirements).
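One classic example of redundancy in input channels is the "two-out-of-three" voter: three independent sensors measure the same quantity, the system trusts the median reading, and if fewer than two channels agree it forces the safe state. A minimal sketch - the agreement threshold and the return convention are my own choices, not any standard's:

```python
def vote_2oo3(a, b, c, max_spread=0.5):
    """Two-out-of-three voting on three redundant sensor channels.
    Returns (value, ok): the median reading, and whether at least two
    channels agree with it within max_spread. When ok is False, the
    caller is expected to force the system into its safe state."""
    readings = sorted([a, b, c])
    median = readings[1]
    agreeing = sum(1 for r in readings if abs(r - median) <= max_spread)
    return median, agreeing >= 2

print(vote_2oo3(20.1, 20.2, 20.0))      # all three channels agree
print(vote_2oo3(20.1, 99.0, 20.0))      # one faulty channel is outvoted
print(vote_2oo3(20.0, 50.0, 99.0)[1])   # no agreement: go to safe state
```

The charm of the scheme is that a single failed sensor changes nothing: it is simply outvoted by its two healthy colleagues.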
Of course, building an ICT system according to rules like these comes at a big cost.
But in exchange, you get a system which, under failure, behaves predictably - at least, to some socially acceptable level.
Of course, trading money for safety is not specific to ICT systems. In fact, many examples have been around us for centuries, so pervasive we take them for granted. Think, for example, of the braking systems used in the railway industry since its inception: their command system is based on vacuum instead of high pressure. If the train has to brake, the train-(wo)man lets air enter the command pipes, instead of pumping it out. This, because if a command pipe broke, air would enter the system and the train would brake automatically until stopping - the train's safe state. If, on the contrary, the brakes were operated by pumping a high-pressure fluid into the command system, then a breach in a pipe would prevent the train from braking. And that would be not so safe...
You may wonder why the automotive industry followed the (unsafe!) high-pressure approach instead. This is a long story, rooted in costs, profits, and the mass market. And you may also notice that to drive a potentially lethal car you just have to pass a simple exam, so simple that even I (a really bad driver) was able to pass it with little effort. While if you want to drive a train, a big ship, or an airplane, you must withstand a long initial training plus lifelong updates.
Here we're getting a bit far, I'm afraid, from the usual male turf.
The same measures that allow a potentially dangerous system to behave predictably and, to any possible extent, not unsafely, would also yield personal computers which do not freeze, word processors which do not demand you save your work every ten minutes because they may crash, and similar desirable properties.
Said differently, safety measures are not safety-specific: they are just good engineering principles.
So, why are they not used extensively?
The answer is complex, but much of it has to do with Moore's law: computing power increased exponentially, while its cost declined - also exponentially. Part of the cost reduction was due to efficiency gains made available by technology. But another part (maybe the larger one) came from cutting all the "unnecessary things" from mass-produced systems. Among them: self-test code, redundancy, hardware standardisation and quality. The resulting systems are, OK, way faster and cheaper. But occasionally they crash without any apparent reason. And we users, crash after crash, have become less finicky, more tolerant of malfunctions, and, lastly, ironic - with the kind of irony of the powerless facing the vagaries of an incomprehensible World.
Risk-taking, and shorter time-to-market. Crushing competitors. This is the ideology beneath it all (yes: it may not seem so, but technology evolves under the powerful drive of a dominant ideology - in the case of old-style ICT, a quite patriarchal-flavoured one).
In the meanwhile, as complexity soared, people began to feel something magic in computers. Maybe you have heard terms like "speculative execution of instructions" by some processor. Do you know exactly what it means? I don't. I began working in ICT around 1986, but (maybe just because of my Triassic education) I still have not intuitively grasped the details of what speculative execution is. I got some general idea, but that's all. Sufficient to let some marketing specialist try to sell me a processor "more speculative" than some other. But not enough to decide whether to use speculative execution in an application, or to disable it instead.
Whenever sense of magic, or sacred, is evoked, magicians and priests will come to light. Whether this is good or evil, I leave the decision up to you (I guess for evil, though).
Can we (we! the women! you gals!) do something, to make this better?
What am I saying, in the end?
As engineers, or future ones, we always do better staying practical.
So, let me say what's my actual point.
Of course I'm not asking for a complete redesign of any ICT piece on Earth.
More modestly, what I'm proposing is to indicate ways to make better use of existing technology, by adopting some principles of safety engineering (and other disciplines) in fields where the benefits would be great, but where historically no big effort has been made.
Some examples I've come in contact with during my life:
- Environmental monitoring.
- Decision support systems (for human health, and/or environment protection).
- Field data collection, processing and archiving for "non-safety-critical" uses.
- Data processing in support to scientific research, policymaking, or both.
Surely you know, or can imagine, other fields. Well: we might add them to the list.
Another point (and indeed a very important one) is that we all become aware of, and possibly comfortable with, the technical concept of "risk". And have at least a little grasp of how it can be estimated sensibly and objectively, if only to make us more informed and demanding citizens of our countries.
Then, consider taking up a career in ICT, and bringing to it new, fresh perspectives, and maybe a more sustainable agenda.
Attitudes which, in my feeling, would make for a great, modern ICT expert could be:
- A for-people mind-set.
- Awareness of risk, in both the intuitive and technical senses - and willingness to avoid or reduce it.
- A drive to interact with people (maybe, not the stakeholders only) to get, and then organise, sets of requirements (functional, on safety, on security, on cost) which make sense, and around which it is possible to build consensus.
- Knowledge of technology and, specifically, of the basics of programming, electronic systems, and safety engineering.
- Passion. And love. And empathy. And, willingness to communicate.
You may have noticed that some attributes deemed desirable in past times are not in my list. Geekiness, for example. Or the pride of being considered a "coder", a "hacker". Or being available to work 14 hours a day, casting aside in the process any social connections. Or mandatory maleness. That's not by chance: all this stuff is of the past. Useless, in the demanding world we're facing. Maybe to date many employers still stick to them, but this is mainly because they're run by old people, and old people do not excel at flexibility. Or strategic vision. All these old-style companies will close, or change deeply in an attempt to stay alive, one after another.
You may also wonder why my list includes many "feminine" qualities. I can agree. They are labeled so, in our current (patriarchal) world. As the neurosciences and human ethology say, they are in reality species-bound attributes, important parts of what makes us humans, and not monkeys. That is, they may come more naturally to girls, on average, to date, but do not exclude boys - unless they are willing to be excluded, of course. Which would be a real pity.
Safety engineering is more widespread in academia and industry than you may think, and my first advice is that you spend some time searching, in your area, where courses are offered, and who demands these skills.
If you are brave enough to wish to make some practical experiments, to date there are plenty of devices well suited to that. We live in the age of Arduino, Raspberry Pi, and other similar "inexpensive" and readily available things. Get them, and learn to use them. Band together and get one, if you individually can't afford a specimen. Ask producers for samples, for free or at a symbolic price. And ask for help - we're here.
If you like, find and follow (maybe actively) blogs. I've started mine (you can find it by browsing for maurifavaron.it), and am considering placing there some articles in English dealing with risk and the basics of safety engineering, using Arduino examples. I've not done it yet, however - I'm waiting for your requests. But I promise: as soon as one request comes, I'll start this line.
Now, lastly, my thanks to you, for the time you took to read this post. And my wishes for a brilliant future in ICT!