Ben Bakowski's posts

Recaptcha

Friday, September 10th, 2010

Anti-spamming software is an integral part of many websites, often taking the form of “type in the distorted text you see”. The assumption is that this is – hopefully – too difficult for a machine to do quickly and trivially, but easy for humans with our better pattern-understanding skills.

reCAPTCHA takes this a step further: the second word is used to confirm I am indeed human and not spamming, while the first word is text from a failed attempt to digitize historic documents. In other words, reCAPTCHA provides a (free!) way to get humans to help digitize old text from before the computer era. This is a great idea, but does throw up some interesting examples in practice:

It’s obviously rather harsh to expect someone to be able to enter (i) a mathematical formula with superscripts and subscripts and (ii) Greek text, particularly as the text box only accepts plain English with no formatting. More worryingly, I got thinking about whether other even more inappropriate (i.e., offensive) data might wing its way to the user. I followed this up, and got the following response:

“The facility does have filters in place to prevent offensive words coming up…  some of the words in these texts are difficult for computers to process, we are using the results of your efforts to help decipher them.”

Which makes me wonder: if the texts are difficult to process, how can we truly be confident in any filters? And how should a tester go about raising a concern like this: is it a real defect? Or just a worry…?

Test your attitude to risk

Tuesday, May 11th, 2010

The Times has an interesting article on investors’ attitudes to financial risk tolerance. Data suggest that investors’ perceived and actual risk tolerances can markedly differ, affecting the suitability of their investment portfolios.

So why’s this relevant? Well, as testers we always make risk-based calls on what to test and to what extent. We therefore need to understand how our perception of our own risk tolerance maps to that of the business, so that these decisions are in line with what the delivery, and the business, need.

For a short while The Times has teamed up with finametrica.com who provide an online risk tolerance questionnaire. It’s obviously geared towards financial risk, but it’s well worth a look to see how your own risk tolerance maps to that of the adult population.

A personal note on agile and quality

Friday, April 9th, 2010

Just been involved with the beta testing of the new Feature Pack for OSGi Applications and JPA 2.0 for IBM WebSphere Application Server V7:

OSGi and JPA 2.0 FeP

We’ve been using a much more agile development process than in the past, continually tweaking our approach and trying out new development and test tooling. I’m really pleased to say that this investment has paid off: even as a hardened test cynic, I’m genuinely impressed by the quality.

Make WAR not love

Friday, February 12th, 2010

I’ve been testing a tool that manipulates Java web application archives (WARs) – but it’s a challenge to ensure I’ve got a representative cross-section of real-life WARs to test against.  So far, I’ve been hunting for samples and coding up my own WARs – convolving the Java specifications with common patterns I’ve seen recommended on developer forums. It’s a shame there doesn’t seem to be a repository out there of a broad range of samples for me to deploy to my application server and test the tool against.

Unless anyone’s aware of anything different…?
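In the meantime, hand-rolled samples can at least be generated systematically. A WAR is a zip-format archive, so a minimal sketch in Python can build variants in memory. The `build_war` helper, the `SAMPLES` names and the file contents below are all hypothetical and purely illustrative; they are not a representative cross-section, just a starting point for one.

```python
import io
import zipfile

# Illustrative Servlet 2.5 deployment descriptor (content is an assumption).
WEB_XML = """<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee" version="2.5">
  <display-name>sample</display-name>
</web-app>
"""


def build_war(entries):
    """Return the bytes of a WAR containing the given {path: content} entries."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as war:
        for name, content in entries.items():
            war.writestr(name, content)
    return buffer.getvalue()


# Hypothetical variants to exercise different paths through the tool.
SAMPLES = {
    "with_descriptor": {
        "WEB-INF/web.xml": WEB_XML,
        "index.html": "<html><body>hello</body></html>",
    },
    "static_only": {
        "index.html": "<html><body>hello</body></html>",
    },
}

wars = {name: build_war(entries) for name, entries in SAMPLES.items()}
```

Each value in `wars` can be written to disk as a `.war` file and fed to the tool under test.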

Not everything’s new and shiny

Tuesday, January 12th, 2010

We recently added polls to TestingBlues, so of course I immediately went to exercise my voting right – only to be met with a badly rendered and ugly poll sitting at the bottom of the page.

Viewing a TestingBlues poll in IE6

After a bit of digging, I found out the fault’s mine: I use Internet Explorer 6. The polling feature – provided as a 3rd party plugin – doesn’t support IE6 properly. However, we have decided to keep it for now, as we believe polls that work for 90% of users are better than no polls at all.

In other words, we have tested this website, but made the decision to ship with a known problem. We hope readers now understand why – and aren’t under the perception that we don’t test properly. After all, that would be some irony…

Postscript: It’s my choice to use IE6 of course. It’s important to use (and test against!) older and sometimes unpopular versions of software – simply because some users may not have the luxury of the choice of a newer, shinier release. After all, there are a lot of IE6 users out there: check out some recent usage charts on http://www.betanews.com/article/Statistics-Firefox-35-surpassed-IE7-in-global-usage-share-last-week/1261428919. This did make me think about browser usage data. Perhaps my own usage of IE6 perturbs these metrics, so ultimately I’m testing software purely because I use it. Sort of Heisenberg’s Uncertainty Principle applied to software: an uncomfortable thought somewhat mitigated by the vanishingly small impact I have…

That won’t do nicely

Friday, May 22nd, 2009

At the end of 2008 I switched credit cards from provider X to provider Y, lured by the promises of untold wealth from the cashback deal they offered. However, I have yet to have a month where using this card has gone smoothly, which makes me wonder how well their application and business logic have been tested.

It was all triggered by me setting up a direct debit to pay the card off each month, and I thought nothing more of it. I was therefore somewhat surprised when the next month no payment was taken and I was spanked for interest. I phoned up – and was informed that my direct debit had indeed been set up, but flagged to start in the year 8888. At first glance, this looks like a user error which the software should reject. But then you could argue that 8888 is a valid year – just a very unusual one to use in a direct debit.

So where do we mark the cutoff between sensible data entries and wrong ones? Can we implement fuzzy logic for such subjective testing? And if so, how can we guarantee repeatable behaviour between tests – which is critical for automated testing?
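One way to keep a subjective cutoff repeatable is to make it explicit and deterministic. The sketch below is only an illustration, not how any card provider actually works: the `Verdict` names, the `classify_start_date` function and the five-year threshold are all assumptions. It accepts near-term dates, rejects dates in the past, and flags valid-but-unusual dates such as the year 8888 for confirmation rather than rejecting them outright.

```python
from datetime import date
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    QUERY = "query"    # valid but unusual: ask the user to confirm
    REJECT = "reject"


# Hypothetical threshold, chosen purely for illustration.
QUERY_AFTER_YEARS = 5


def classify_start_date(start, today):
    """Classify a direct debit start date relative to an injected 'today'."""
    if start < today:
        return Verdict.REJECT
    if start.year - today.year > QUERY_AFTER_YEARS:
        return Verdict.QUERY
    return Verdict.ACCEPT


today = date(2009, 5, 22)
far_future = classify_start_date(date(8888, 1, 1), today)
next_month = classify_start_date(date(2009, 6, 22), today)
last_month = classify_start_date(date(2009, 4, 22), today)
```

Passing `today` in as a parameter, rather than reading the clock inside the function, is what keeps the result identical between automated test runs.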

Anyway – the direct debit is now working, but this appears to have triggered me travelling down some sort of error path in the credit card company’s business logic. I’ve decided to stop using this card for 2 months to let the dust settle: and then perhaps I’ll test their business logic some more…

Risk in other industries

Wednesday, January 21st, 2009

There was a very interesting programme on BBC2 on how risk is managed (or rather mismanaged) in the financial markets. The City Uncovered with Evan Davis is well worth watching – even if just to understand how human nature sacrifices risk management in the pursuit of performance.

It’s well summed up by Evan Davis’ closing statement: “If you think you’ve got risk licked – you haven’t”.

What kind of engineer are you?

Wednesday, January 7th, 2009

Over the holidays I got asked, “what makes a good software tester?”. As well as recounting the traditional traits of sheer good looks, athleticism and sophisticated banter, this actually got me thinking back to what we are here to do. Crudely speaking, we are here to find defects. [Note that I don't state "find and fix defects" - as that's a topic for another discussion].

To do this well requires a broad range of character traits – just like our customers have a broad range of approaches to using our software. So when asked if I’d want a tester who takes a very rigorous approach, or one with a much more ad-hoc attitude to testing – I’d happily take both.

Defects are oily…

Tuesday, December 9th, 2008

How often have you heard, “We have to fix the low severity defects now, or they will never get fixed!” – even when you know some high priority core functionality is still not working?

Stepping back, fixing low severity defects may not seem in line with the agile approach, where your backlog is tackled in priority order. However, I like to think of defects as droplets of oil: they aggregate, and collectively they can form a much bigger droplet and a more serious problem. This seems particularly true of consumability defects.

In this case, resolving low severity defects collectively is indeed a high priority task. But there are challenges:

  • How can we track this aggregation – and hence prioritise defect resolution? Informally, through impact on user stories? Or through modifying defect tracking tools to recognise that defects are not necessarily isolated issues, but instead can mutually reinforce?
  • Is it an “all or nothing” approach to resolving the aggregated defects, or should you pick off individual ones to shunt the overall problem back down the priority stack?
  • Finally, I have interchanged “severity” and “priority” in this post. Purists will argue these are separate attributes for a defect – but I think this post demonstrates how they can be strongly correlated.
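As a rough illustration of the first question, aggregation could be tracked by grouping defects by the user story they affect and summing a weight per severity. This is a sketch under stated assumptions: the `Defect` fields, the `SEVERITY_WEIGHT` values and the sample data are all invented for the example, and no real defect tracking tool is implied.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical weights: one high severity defect counts as five low ones.
SEVERITY_WEIGHT = {"low": 1, "medium": 3, "high": 5}


@dataclass(frozen=True)
class Defect:
    defect_id: str
    story: str      # user story the defect affects
    severity: str   # "low", "medium" or "high"


def aggregate_by_story(defects):
    """Return (story, total weight) pairs, heaviest first, ties by name."""
    totals = defaultdict(int)
    for defect in defects:
        totals[defect.story] += SEVERITY_WEIGHT[defect.severity]
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Invented sample data: six low severity niggles against one high severity defect.
defects = [Defect(f"D{i}", "install", "low") for i in range(6)]
defects.append(Defect("D6", "login", "high"))

ranking = aggregate_by_story(defects)
```

With these weights the six low severity "install" defects total 6 and outrank the single high severity "login" defect at 5, which is the oil droplet effect in miniature. Fixing any two of them would drop "install" back below "login", which bears on the second question too.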

Blogging overrides marketing?

Friday, November 28th, 2008

Interesting to see that blogging could be the new “make or break” for a product:

 http://www.bbc.co.uk/blogs/technology/2008/11/can_stephen_fry_kill_a_gadget.html

Is this because blogging is so immediate – so you’re much more likely to put down your first impressions? A powerful argument for ensuring that you get all those consumability niggles sorted…