Testing safety-critical systems

March 15th, 2010

David Cummings has written an article for the Los Angeles Times about his experiences with testing on the Mars Pathfinder project, and how that might relate to Toyota’s recent problems:

If Toyota has indeed tested its software as thoroughly as it says without finding any bugs, my response is simple: Keep trying. Find new ways to instrument the software, and come up with more creative tests. The odds are that there are still bugs in the code, which may or may not be related to unintended acceleration. Until these bugs are identified, how can you be certain they are not related to sudden acceleration?

This brings us back, as ever, to the question of “when do you stop testing?” How do you answer that question when safety is involved? At some point, you have to ship your product. With software, we are often afforded the luxury of updating it afterwards, but that isn’t always possible (problems with installation programs usually fit into this category, as do many embedded systems). We can never prove the absence of errors.


Unit testing achievements

March 3rd, 2010

Here’s a fun take on unit testing achievements (in the Xbox 360 sense of the word “achievement”): http://exogen.github.com/nose-achievements/

The gateway to release

February 26th, 2010

The executive who decides whether or not you will be allowed to release your product will not be looking at your product code to make that decision. They will be looking at your defect numbers and your test pass rates.
This is how I started a conference call last week, as part of a session to encourage more focus on test.
In a truly Agile team there are no dedicated testers or developers. Everyone is a software engineer and everyone shares responsibility for the quality of what is produced.
In an ideal world everyone would harbour a deep desire to ensure testing has been done fully, before even considering the possibility of coding more features. Sadly, in reality it is all too tempting to charge ahead with the next feature on the list, because it is so hard to know when you are ‘done’ testing and so easy to see the features you’ve not started yet.

So I find myself trying to raise the profile of the business of testing, why we do it, and how our process ties into it. Software engineers find it easy to understand all the features that need to be written to meet the requirements. What is needed is a gentle reminder that the gateway to release is controlled by someone who will not try the product, and who will not read the code or even the documentation. The measures that they use to determine fitness for release are defect trends and test pass rates. They look to see that the rate of high severity defects being raised has dropped off, and that the tests are, if not 100% passing, then very close to it.

In a traditional team, one or two key people needed to worry about that and direct their fixed teams accordingly.  Part of being an Agile team is making sure that everyone understands these metrics and takes their share of responsibility to prioritise test in their planning.

Shortly after my conference call, one of the local component leads told me that another team had let him know they would not be developing the feature he wanted until the next sprint, because their plan was full with testing in the current sprint.  What’s more, he understood and accepted that this took priority over his request. So, for the moment at least, the message is getting through.

Testing large-scale web applications

February 17th, 2010

Following the release of the first graphical interweb browser back in 1993 a whole new field of software development was opened up – that of the internet application.

Whilst we now take for granted the ability to enjoy our banking, shopping, socialising and entertainment online, back in the “early days” this future was by no means assured.

The proliferation of web applications and their increasing complexity pose huge challenges to us as testers. In this post we’ll take a *very high-level* tour of what to look out for, and how we might approach testing web applications.

Make WAR not love

February 12th, 2010

I’ve been testing a tool that manipulates Java web application archives (WARs) – but it’s a challenge to ensure I’ve got a representative cross-section of real-life WARs to test against.  So far, I’ve been hunting for samples and coding up my own WARs – convolving the Java specifications with common patterns I’ve seen recommended on developer forums. It’s a shame there doesn’t seem to be a repository out there of a broad range of samples for me to deploy to my application server and test the tool against.

Unless anyone’s aware of anything different…?

Code coverage with Cobertura

February 8th, 2010

I discovered the Cobertura code coverage tool the other day. It is an elegantly simple tool, which produces simple, easy-to-read reports. My current project is using Apache Maven for its build system, and with the Cobertura Maven plugin provided, I simply added a couple of lines to my project file, and ran a build. From start to finish, I spent less than five minutes to go from nothing to having some really useful code coverage data. The report even includes a code complexity metric to help you identify the more complex areas of code.

I’m impressed by the simplicity of this tool – quite a few years ago, I spent months and months getting code coverage data for a large project (code coverage was relatively new then), and it’s great to see how easy it has become to use this technology.  I’ve previously used Emma for code coverage, which is also a great tool, but I think Cobertura has the edge when it comes to reporting.

Targeted Testing

January 19th, 2010

The final production build of the software is ready. The last full run of testing starts. After two weeks, a potentially serious problem is found and the product manager decides it has to be fixed. The fix is rushed in, unit tested and built into a new production build.

Now, there are only two days left before the product must be shipped to customers. The three week final test phase now goes “out the window” and the test team do the best they can to verify that the product still works without any serious regressions.

This is an example of “Targeted Testing” being applied. Every test team has to do it at some stage, because there simply isn’t enough time and resource to do an ideal job. The effectiveness of that final test of the shipped product depends on the skills of all those involved in the decisions affecting every aspect of the life cycle of that last change – and probably a great deal of luck.

“What I need is a list of specific unknown problems we will encounter.”
(“Dilbertism” from Lykes Lines Shipping)

Time constraints like this are actually happening throughout the whole product development cycle. All the tests in the official plan may be run in several test phases, but by the end of each phase, the product has moved on, and in an ideal world, all those tests (and some not even dreamed of yet) need to be re-run.

Wouldn’t it be great if there were some tools and techniques to help do Targeted Testing so that at any time in the lifecycle of the product, we could know exactly which test is most likely to find a possible problem? All the available tests would be dynamically ordered based on this ever changing likelihood of finding problems. That way, we could be sure that the best possible testing was being done at any point in time in whatever time is available.
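As a rough illustration of what that dynamic ordering might look like, here is a minimal sketch in Python. Everything in it is an assumption made for the example rather than an existing tool: the `CandidateTest` record, the scoring formula (a smoothed historical failure rate, boosted when a test covers recently changed files) and the sample data are all hypothetical. It ranks tests by likelihood of finding a problem per minute of run time, then fills the time that is available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTest:
    """Hypothetical record of what we know about one test."""
    name: str
    runs: int              # how many times it has been run before
    failures: int          # how many of those runs found a problem
    covers: frozenset      # source files the test exercises
    minutes: float         # typical run time, must be greater than zero


def likelihood(test, changed_files):
    """Assumed score: smoothed failure rate, boosted by overlap with changes."""
    base = (test.failures + 1) / (test.runs + 2)   # avoids 0 for tests that never failed
    overlap = len(test.covers & changed_files)
    return base * (1 + overlap)


def prioritise(tests, changed_files, budget_minutes):
    """Order tests by likelihood per minute, then take what fits in the budget."""
    ranked = sorted(
        tests,
        key=lambda t: likelihood(t, changed_files) / t.minutes,
        reverse=True,
    )
    chosen, remaining = [], budget_minutes
    for t in ranked:
        if t.minutes <= remaining:
            chosen.append(t)
            remaining -= t.minutes
    return chosen


# Made-up sample data: a late fix has just touched billing.py.
tests = [
    CandidateTest("login_smoke", 100, 2, frozenset({"auth.py"}), 5),
    CandidateTest("billing_full", 50, 10, frozenset({"billing.py"}), 60),
    CandidateTest("report_export", 80, 1, frozenset({"report.py"}), 10),
]
changed = frozenset({"billing.py"})

for t in prioritise(tests, changed, budget_minutes=70):
    print(t.name)
```

With a 70 minute budget this picks the long billing test first, because it covers the changed file and has failed often, and then fits in the quick login check. A real scheme would need far better inputs (coverage data, defect history, risk assessments) than this toy score.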

Identity theft in web applications

January 19th, 2010

I found this article in a BCS security newsletter that I received in my inbox this morning.

It provides an interesting angle on the testing (checking) of web applications, as even a seemingly trivial deployment may be exploited as part of a system attack.

Testing v Checking

January 15th, 2010

Had a great meeting this morning with a number of senior testers from Hursley.
The main theme of the discussion was around an article found by Russell Finn, ‘Testing v Checking’ – where basically checking is doing what you are told and testing is doing what you feel should be done (verification v validation might be a better way of putting it).
We have evidence from one of our major products that 50% of our field-reported problems are things no one had considered (not designed, coded or tested for). So whilst we are pretty good at the verification side of things, there’s a lot we miss because we are not testing to real stakeholder needs.
A solution we discussed was around exploratory testing where the emphasis is on learning about a system and improving it through testing.

We also discussed quality metrics and how they scale. Traditional methods of counting defects per 1000 lines of code are fundamentally flawed because they don’t consider the impact on the end user. One defect in a 100-line application might sound good, but what if that code is distributed across a million devices and each one needs an update?

Perhaps a better method would be ‘mean time to defect’ – some sort of measure around the time it takes an average end user to encounter a new defect.
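To make the contrast concrete, here is a small sketch of the arithmetic in Python. The function names and the usage figures are hypothetical, invented for this example; the 100 lines, one defect and million devices come from the scenario above. It only shows that the two measures look at different things, not how to collect the data.

```python
def defects_per_kloc(defects, lines_of_code):
    """Traditional density: defects per 1000 lines of code."""
    return defects * 1000 / lines_of_code


def devices_needing_update(defects, devices):
    """Field impact if every device needs one update per defect."""
    return defects * devices


def mean_time_to_defect(usage_hours, new_defects_encountered):
    """Assumed definition: hours of end-user use per new defect encountered."""
    if new_defects_encountered == 0:
        return None  # nothing encountered yet, so no estimate
    return usage_hours / new_defects_encountered


# The scenario above: one defect in a 100-line application on a million devices.
density = defects_per_kloc(defects=1, lines_of_code=100)
updates = devices_needing_update(defects=1, devices=1_000_000)

# Made-up usage log: 5000 hours of observed use, 4 new defects hit by users.
mttd = mean_time_to_defect(usage_hours=5000, new_defects_encountered=4)

print(density, updates, mttd)
```

The density figure says nothing about the million updates, whereas a time-based measure at least starts from what the end user experiences.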

Final thoughts were around the combative nature of a tester raising a defect. Each time we raise a defect we are effectively telling the coder their baby is ugly. Is there a better way of doing this to get testers and coders cooperating more? One suggestion was to change the term ‘defect’ to ‘opportunity for improvement’ – not ideal I know but I like the sentiment. Comment if you have any better ideas.

Like Software Testing? Like Clubs?

January 13th, 2010

Then try softwaretestingclub.com.

Rosie Sherry from softwaretestingclub.com left a comment on my last post after discovering TestingBlues for the first time. I’ve had a browse through the content over there and am really impressed. I particularly like the exchange – a place to ask testing questions.

Who knows, we may be able to collaborate in the future.