2 Feb. 2006

Measuring software reliability

I attended a talk by John Munson entitled “Measuring Software Reliability”. The gist of it was:

“Tell me how you use the software and I’ll tell you how reliable you’ll be.”

According to the speaker, software can be mapped from a description of its use (operational specification), to a description of its design (functional specification), to a description of its actual implementation in code (modular specification). This gives you the ability to go from use case to code that’s executed during that use case.

Measurements of the software are then taken over a period of time (though what exactly these measurements are was never explained), and the reliability of the software at the current time can be calculated from its past performance.

Does anyone else see where this breaks down in the real world? Case in point: the Y2K bug. Everything runs fine for decades, and we have a solid history of reliability. Then, all of a sudden, we’re up against a scenario that the designers never considered, and reliability goes down the tubes.

Plus, you have to realize that customers are motivated by features and companies are motivated by money. You have to constantly be putting in new stuff, or your customer’s going to go looking for the competition when you can’t provide her with what she wants. Release cycles where the code is continually changing really mess with your ability to measure reliability.

Change is inevitable. The sooner we start looking at software engineering as a discipline that’s unprecedented (because it is), the better off we’ll be. The practices that work for the mechanical engineers and the electrical engineers don’t work for us. Software is far more flexible and bounded by far fewer laws than some academics would have you believe.

I will, however, agree that there are some instances where this kind of measurement could be really useful. ATMs, airplane flight control systems, and anything else where the operational environment is controlled, and all the inputs are known, could probably benefit from reliability measurement. Interestingly enough, those are the kinds of environments where you would want reliability measurement, because they’re the ones that are high risk.

So it’s one of those things that works well in academia and fault-tolerant systems, but is of little use when it comes to building applications that real people will use to do many different things.

Notes

Here are my notes from the talk. They roughly correspond to the PowerPoint slides presented.

The Notion of Reliability

Related to the notion of a failure event
Likelihood that a system will not fail during some usage interval
Failure event is observable
Time between failure events is observable
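
To make the definition above concrete, here is a minimal sketch of my own (the talk gave no formula). It splits an observation period into fixed usage intervals and reports the fraction in which no failure event was observed. The function names, the interval length, and the failure timestamps are all invented for illustration.

```python
def empirical_reliability(failure_times, total_time, interval):
    """Fraction of complete usage intervals in which no failure was observed.

    failure_times: timestamps (hours) of observed failure events (made-up data)
    total_time:    length of the observation period, in hours
    interval:      length of one usage interval, in hours
    """
    n_intervals = int(total_time // interval)
    if n_intervals == 0:
        raise ValueError("observation period is shorter than one interval")
    # Collect the index of every interval that contains at least one failure.
    failed = {int(t // interval) for t in failure_times if t < n_intervals * interval}
    return (n_intervals - len(failed)) / n_intervals


def times_between_failures(failure_times):
    """Gaps between consecutive failure events, which are also observable."""
    ordered = sorted(failure_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical log: failures at hours 12, 13 and 70 of a 100-hour observation period.
failures = [12.0, 13.0, 70.0]
print(empirical_reliability(failures, total_time=100.0, interval=10.0))
print(times_between_failures(failures))
```

Two of the ten intervals contain a failure here, so the estimate only says something about usage that looks like the usage already observed.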

Characteristics of Hardware Systems

All manufactured systems are different
Different molecules
Different mechanical structures
Slight variation in manufactured components
Will wear out over time
Will eventually fail

Characteristics of Software Systems

All manufactured systems are identical
Software is eternal
Software is ethereal
Software does not change as a function of time
Software does not wear out

Measuring Software Behavior

Software systems do not fail monolithically
Software modules fail
Not all features execute all modules
Some features execute some fault prone modules

Functional Reliability

Programs made up of many structurally independent modules
Some modules are good (fault free)
Some modules are bad (fault prone)
Each functionality exercises only a small set of program modules
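
One way to picture this, as a sketch under my own assumptions: if we know which modules each functionality exercises and which modules are fault prone, a set intersection tells us which functionalities are exposed. The functionality names, module names, and the choice of fault-prone module below are hypothetical.

```python
# Hypothetical mapping from each functionality to the modules it exercises.
modules_by_functionality = {
    "login": {"auth", "session"},
    "search": {"index", "query_parser"},
    "export": {"query_parser", "pdf_writer"},
}

# Modules assumed, purely for illustration, to be fault prone.
fault_prone = {"pdf_writer"}


def exposed_functionalities(mapping, bad_modules):
    """Return each functionality that exercises at least one fault-prone module,
    together with the offending modules."""
    return {
        name: sorted(mods & bad_modules)
        for name, mods in mapping.items()
        if mods & bad_modules
    }


print(exposed_functionalities(modules_by_functionality, fault_prone))
```

In this toy data only "export" touches the bad module, so a user who never exports would never see it fail.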

Software Physics

The notion of time has no meaning in software
Programs do not fail because they wear out
Programs do not change with respect to time
Past reliability of a program is not a good predictor of future reliability (Y2K)

3D Mouse Example

User manipulates a tennis ball
Put spots on a tennis ball and use pattern recognition
Put accelerometers in a tennis ball and measure the delta
Operations map to Functionalities (O x F)
Functionalities map to Modules (F x M)
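
The two mappings can be read as 0/1 matrices, and composing them gives the modules that run for each operation. The sketch below is my own reading of the slide, not something the speaker showed; the operation, functionality, and module names are invented around the tennis ball example.

```python
operations = ["rotate ball", "move ball"]
functionalities = ["recognize spots", "read accelerometers", "update pointer"]
modules = ["camera_driver", "pattern_matcher", "accel_driver", "pointer_math"]

# O x F: which functionalities each operation uses (1 = uses).
o_x_f = [
    [1, 0, 1],  # rotate ball
    [0, 1, 1],  # move ball
]

# F x M: which modules implement each functionality.
f_x_m = [
    [1, 1, 0, 0],  # recognize spots
    [0, 0, 1, 0],  # read accelerometers
    [0, 0, 0, 1],  # update pointer
]


def compose(left, right):
    """Boolean matrix product: entry [i][j] is 1 if some middle index links i to j."""
    return [
        [int(any(a and b for a, b in zip(row, col))) for col in zip(*right)]
        for row in left
    ]


# O x M: which modules end up executing for each operation.
o_x_m = compose(o_x_f, f_x_m)
for op, row in zip(operations, o_x_m):
    print(op, "->", [m for m, used in zip(modules, row) if used])
```

This is the "use case to executed code" path from the talk: each row of the composed matrix lists the modules a given operation can reach.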

Measurement

There are probabilities for the execution of operations, functions, and modules
Watch your software execute and see the patterns of execution
Build a meter to give an estimate of reliability right now based on past performance
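
The speaker never explained how the meter works, so the following is only a guess at its shape. It counts how often each operation and module executes, estimates each module's reliability from its observed failures, and weights the result by how often each operation is actually used. The class name, the method names, the ATM-style data, and the assumption that modules fail independently are all mine. It needs Python 3.8 or later for `math.prod`.

```python
from collections import Counter
from math import prod


class ReliabilityMeter:
    """Toy meter: estimates reliability from the executions observed so far."""

    def __init__(self, modules_by_operation):
        self.modules_by_operation = modules_by_operation
        self.operation_runs = Counter()
        self.module_runs = Counter()
        self.module_failures = Counter()

    def record(self, operation, failed_module=None):
        """Log one run of an operation and, optionally, the module that failed."""
        self.operation_runs[operation] += 1
        for module in self.modules_by_operation[operation]:
            self.module_runs[module] += 1
        if failed_module is not None:
            self.module_failures[failed_module] += 1

    def module_reliability(self, module):
        runs = self.module_runs[module]
        # Assumption: a module never seen executing is treated as reliable.
        return 1.0 if runs == 0 else 1.0 - self.module_failures[module] / runs

    def estimate(self):
        """Usage-weighted reliability, assuming modules fail independently."""
        total = sum(self.operation_runs.values())
        if total == 0:
            raise ValueError("no executions observed yet")
        return sum(
            (runs / total)
            * prod(self.module_reliability(m) for m in self.modules_by_operation[op])
            for op, runs in self.operation_runs.items()
        )


# Hypothetical usage: balance checks are common, withdrawals are rare and flaky.
meter = ReliabilityMeter({"withdraw": ["auth", "ledger"], "balance": ["auth"]})
for _ in range(8):
    meter.record("balance")
meter.record("withdraw")
meter.record("withdraw", failed_module="ledger")
print(meter.estimate())
```

Note that the estimate depends entirely on the usage seen so far. An operation nobody has run yet contributes nothing, which is exactly the Y2K problem.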