Category Archives: Software Testing

Putting Cucumber where it’s not supposed to go will hurt!

Today, I came across this post by Ryan Bigg where he talks of the pains he’s experienced with Cucumber.

Fortunately for Ryan, the outcome was a positive one: he ended up finding what appears to be a nice-looking API for automating test-execution.

The experience he had with Cucumber, however, is a common one. Often teams start using Cucumber in an effort to reduce the need for programming skills when writing automated tests.

This isn’t the problem that Cucumber and other BDD tools are trying to solve, which is why so many teams that misuse it in this way experience difficulty and frustration.

BDD tools like Cucumber are designed to help teams and their customers arrive at a shared understanding of the problem they wish to solve (in terms of the user’s goals and tasks). It’s not really intended to describe the solution (i.e. step-by-step interactions with the GUI).

If your Cucumber scenarios have words such as “click” or “type” or mention field names and the like, it’s probable that you’re either:
a) Using Cucumber to automate test-execution (not what it was intended for) or
b) Trying to do BDD but writing the scenarios at too low a level of detail.

If you’re doing the latter (b)… then you may find the post I wrote over the weekend useful.

If what you’re doing is (a) then the post I wrote over the weekend might help you go down a path where automating your tests will make the transition to BDD that much easier.

If you really, really only want to automate test-execution then you might find that you’re putting Cucumber where it’s not supposed to go… and just so you know – that’s probably going to hurt 🙂

A bit of UCD for BDD & ATDD: Goals -> Tasks -> Actions

There’s something wrong with many behaviour specs (or acceptance tests). It’s been this way for some time. I’ve written about this once or twice before, referencing this post by Kevin Lawrence from 2007.

So, first things first, I want to take this opportunity to update the terminology I use…

Goals -> Tasks -> Actions
A useful technique in User-Centred Design (UCD) and Human-Computer Interaction (HCI) design is Task Analysis. There are three layers of detail often talked about in Task Analysis:

Goal: What we’re trying to achieve, which has one or more…
Tasks: The high-level work-items that we complete to fulfil the goal, each having one or more…
Actions: The specific steps or interactions we execute to complete the task.

Previously, the terms used by Kevin, myself (and several others) were Goal->Activities->Tasks. From now on, I’m going to use the UCD/HCI/Task Analysis terms Goal->Task->Action. It’s exactly the same model – just with different labels at the three layers of detail.

What’s wrong with your average behaviour spec?
Let’s look at a common example… the calculator. For convenience, I’m going to borrow this one from the Cucumber website[1].

Scenario: Add two numbers
Given I have entered 10 into the calculator
And I have entered 5 into the calculator
When I press add
Then the result should be 15 on the screen

[1] Note: I’m using this example for convenience and simplicity. The value of the example on the Cucumber website is to demonstrate how easy Cucumber is to use, not necessarily to serve as an exemplar feature spec.

What would you say the scenario’s name and steps are representing? Goals and Tasks or Tasks and Actions?

I’d argue that adding two numbers is a task and the steps shown above are actions.

Instead, I try to write the spec (or test) with a scenario-specific goal and the steps as tasks.

Scenario: sum of two numbers
When I add 10 and 5
Then I should see the answer 15

I.e.:
Scenario: <A scenario specific goal>
Given <Something that needs to have happened or be true>
When <Some task I must do>
Then <Some way I know I've achieved my goal>

Another Typical Example
Let’s apply this to another typical example that might be closer to what some people are used to seeing:

Scenario: Search for the cucumber homepage
Given I am at http://google.com
When I enter "cucumber" into the search field
And click "Search"
Then the top result should say "Cucumber - Making BDD fun"

Again this is talking in terms of actions. As soon as I’m talking in terms of click, enter, type or other user-action, I know I’m going into too much detail. Instead, I write this:

Scenario: Find the Cucumber home page
When I search for "Cucumber"
Then the top result should say "Cucumber - Making BDD fun"
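
The clicking and typing doesn’t disappear, of course – it moves down into the step definition, out of the business-facing scenario. A minimal sketch of what that might look like, assuming a Capybara-style driver (the field and button names here are guesses, not Google’s real ones):

When /^I search for "(.*)"$/ do |term|
  visit "http://google.com"       # the actions live down here now
  fill_in "q", :with => term      # assumed name of the search field
  click_button "Google Search"    # assumed label of the search button
end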

Why does it matter?
In the calculator example, the outcome the business is interested in for the first scenario is that we get the correct sum of two numbers. The steps in the scenario should help us arrive at a shared understanding of what ‘correct’ means. How we solve the problem of getting the numbers into the calculator and choosing the operator is a separate issue. That’s detail we can defer. It may be better to explore the specifics of the workflow using sketches, conversation, wire-frames or by seeing and using our new calculator.

So, expressing our scenarios in terms of goals and tasks helps the delivery team and the business arrive at a shared understanding of the problem we’re trying to solve in that scenario.

Expressing the scenario in terms of the clicks, presses and even fields we’re typing into is focusing on a solution, not the problem we’re trying to solve.

But there’s another benefit to doing it this way…

A Practical Concern – Maintainability
The first calculator example follows reverse-polish notation for the sequence of actions:

  1. enter the first number
  2. enter the second number
  3. press add

If we put that detail in the scenarios, that workflow will be repeated everywhere – for addition, subtraction, division, multiplication… and anywhere any calculation is described.

What happens if the workflow for this calculator (whether through a UI or an API) changes from reverse-polish notation to a more conventional calculator workflow? Like this:

  1. enter the first number
  2. press add
  3. enter the second number
  4. press equals

The correct answer hasn’t changed – i.e. the ‘business rule’ is the same – it’s just that the sequence of actions has changed.

Now we have to go through all the feature files and update all the scenarios. In the case of our calculator that may only be a few files covering add, subtract, divide, multiply. For something larger and more complex this could be a lot of work.

Instead, putting this in the code of the step-definition means that this:

When /^I (add|subtract|divide|multiply) (.*) (?:and|from|by) (.*)$/ do |operator, first_number, second_number|
  @calc ||= Calculator.new
  @calc.enter first_number
  @calc.enter second_number    # both numbers are entered first...
  @calc.send operator.to_sym   # ...then the operator is applied (the reverse-polish workflow)
end

Would have to change to this:

When /^I (add|subtract|divide|multiply) (.*) (?:and|from|by) (.*)$/ do |operator, first_number, second_number|
  @calc ||= Calculator.new
  @calc.enter first_number
  @calc.send operator.to_sym   # moved up from the bottom
  @calc.enter second_number
  @calc.calculate              # new action for the conventional calculator workflow
end

Nothing else would need to change – other than the product, of course. By writing the steps as tasks, all of our feature files would still accurately illustrate the business rule even though the workflow of the interface (GUI or API) has changed.
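
For what it’s worth, the Then step wouldn’t need to change in either case. A sketch of what it might look like, assuming our hypothetical Calculator exposes the answer via a result method:

Then /^I should see the answer (.*)$/ do |expected|
  @calc.result.should == expected.to_f   # 'result' is an assumed method on the Calculator
end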

Taking your scenario-steps to task
So, taking this approach helps us to reduce our maintenance overheads. Putting the Actions inside the step definitions makes our tests less brittle. When the workflow changes, we only need to make the change in one place – the step-definitions.

Of greater importance, perhaps: by capturing the user’s scenario-specific goal as the scenario name and the tasks as the steps, the team works towards a shared understanding of the part of the problem we’re trying to solve, not a solution we might be prematurely settling upon. When rolled up with the broader context (expressed in the user-story and its other scenarios), it gives us a focus on solving the wider problem rather than biasing the solution.

Sometimes, at first, it’s hard to see how to express the scenario this way, and it’s easier when people start by talking of clicks, presses and what they’re entering. And that’s ok – maybe that’s where the team needs to start the conversation. Just because that’s where you start doesn’t mean that’s where you have to end up: ask “why are we doing those actions?” a few times.

Old Favourite: Taking Repetition To Task

This originally appeared on my old blog on 16th March 2010…

Others have talked about the virtues of stories as vertical slices of a problem (end-to-end capabilities) rather than horizontal slices (system layers or components). So, if we slice the problem with user stories, how do we slice the user-stories themselves?

If, as I sometimes say, acceptance tests (a.k.a. examples/scenarios/acceptance-criteria) are the knife with which we slice a story into even thinner vertical slices, then my observation is that ‘tasks’ are used as the knife with which to cut a story into horizontal slices. This feels wrong…

Sometimes I also wonder, hasn’t anyone else noticed that the idea of counting the effort of completed tasks on burn-down/up charts is counter to the value that we measure progress only with working software? Surely it makes more sense to measure progress with passing tests (or “checks” – whichever you prefer).

These are two of the reasons I’ve never felt very comfortable with tasks:

  • they’re often applied in such a way that the story is sliced horizontally
  • they encourage measuring progress in a less meaningful way than working software

Tasks are, however, very useful for teams at first. Just like anything else we learn how to do, working it out on paper can often help us until we can discard the paper and do the workings in our heads. What I’ve noticed, though, is that most teams I’ve worked with continue to write and estimate tasks long after the practice has stopped being useful or relevant to them.

For example, there comes a time for many teams where tasks become repetitive. “Add x to the Model”, “Change View”… and so on. Is this adding value to the process or are you just doing it because the process says you should do it?

Simply finding that your tasks are repetitive doesn’t mean the team is ready to stop using them. There is another important ingredient: meaningful acceptance criteria (scenarios / acceptance-tests / examples).

I often see stories with acceptance criteria such as:

  • Must have a link to save the profile
  • Must have a drop down to select business sector
  • Business sector must be mandatory

Although these are “acceptance criteria”, they aren’t what we mean by acceptance criteria in the context of user stories. Firstly, they are talking about how the user interacts rather than what they need to achieve (I’ve talked about this before). Secondly, they aren’t examples. What we want are the variations that alter the behaviour or response of the product:

  • Should create a new profile
  • Profile cannot be saved with blank “business sector”
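
In Cucumber terms, the second of those criteria might be illustrated with a scenario along these lines (a sketch – the step wording is mine, not taken from a real feature file):

Scenario: Profile cannot be saved with a blank business sector
Given I have filled in my profile
But I have left "business sector" blank
When I try to save my profile
Then I should be told that "business sector" is required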

As our product fulfils each of these criteria, we are making progress. Jason Gorman illustrates one way of approaching this.

So, if you are using tasks, consider an alternative approach. First, look at your acceptance criteria and make sure they are more like examples and less like instructions. Once that’s achieved, consider slicing each criterion (or scenario) horizontally with your tasks rather than slicing the story. Pretty soon, you’ll find that you don’t need tasks anymore and you can simply measure progress in terms of the new capabilities you add to your product.


Old Favourite: QA / Testing – what’s the difference?

Software is one of the few industries that lumps testing and QA under one banner. It’s one of those things where common misuse of a term results in the community changing its meaning… this happens in mainstream language all the time.

Testing something is actually more analogous to quality control, or QC (although it isn’t quality control).

QA is more concerned with the process – collecting information about the performance of the process in order to determine whether we are ‘assuring’ (or, more realistically, increasing the probability of) quality. Statistical information about problems found in the product (during quality control) is just one of many pieces of information useful to someone concerned with QA (which really should be the whole project team).

In short:

QC helps us answer the question ‘does our product work?’

QA helps us answer the question ‘does our process work?’

Unfortunately, in the software industry, all too many teams don’t realise their process doesn’t work until the testers find all the ways in which the product doesn’t work… maybe that’s why software testing has come to be known as QA.

This article originally appeared on my old blog in July 2008. I had already elaborated on this topic in the article “What’s in a word” in Better Software Magazine in March 2008.

You’re almost cuking it…

In “You’re Cuking it Wrong”, Jonas Nicklas shows several examples of bad scenarios (or acceptance tests, whichever term you prefer) and demonstrates better approaches. It’s an excellent post on the common mistakes made when writing example scenarios with Cucumber.

I think, however, he could have gone further in one case. One of his examples of a bad scenario looks like this:

Scenario: Adding a subpage
Given I am logged in
Given a microsite with a Home page
When I click the Add Subpage button
And I fill in "Gallery" for "Title" within "#document_form_container"
And I press "Ok" within ".ui-dialog-buttonpane"
Then I should see /Gallery/ within "#documents"

(Dude – yep, seen these… I agree… not good). He goes on to suggest it should really look like this:

Scenario: Adding a subpage
Given I am logged in
Given a microsite with a home page
When I press "Add subpage"
And I fill in "Title" with "Gallery"
And I press "Ok"
Then I should see a document called "Gallery"

This is a massive improvement. It keeps the specifics that inform the reader and give them some context (like filling in the “Title” with “Gallery”) and it takes the scenario further away from the implementation. Although it gets closer to expressing customer intent, I think it could go further. At the moment, it is describing the ‘what’ and some of the ‘how’. This example makes complete sense if what we are exploring is the design of the UI; I’ve not found these scenarios to be a good place to do that, however.

Instead, in these specifications we want our examples to illustrate the customer intent – the ‘what’, not the ‘how’. There are other places we can capture the ‘how’ – i.e. in the step methods.

Instead, I would write it like this:

Scenario: Adding a subpage
Given a microsite with a home page
Given I am logged in
When I Add a Subpage with a Title of "Gallery"
Then I should see a document called "Gallery"

I’ve removed all the “Tasks” and left only “Activities”. This leaves the user experience completely open. This ensures that when there are UI changes, I only change the code that performs the tasks (clicking, pressing, etc.) and my scenarios evaluate whether the customer intent is still fulfilled – without having to go back and change a lot of files.
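
The clicking and form-filling from Jonas’s version then lives in the step method, something like this (a sketch assuming Capybara-style helpers; the button labels are lifted from the original scenario and may not match the real app):

When /^I Add a Subpage with a Title of "(.*)"$/ do |title|
  click_button "Add Subpage"       # the 'how' lives down here now
  fill_in "Title", :with => title
  click_button "Ok"
end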

Otherwise – great post Jonas 🙂

Old Favourite: Expected Exceptions

This first appeared on my old blog in November 2008.

I’ve decided that I don’t like typical patterns for testing exceptions. I decided this a while ago as far as “Expected Exception” attributes/annotations are concerned and stuck with the traditional try/catch approach (I’ll explain why in a minute). Now, I’ve decided I don’t like the typical “try/fail or catch” approach and have started using a subtle evolution of it.

First, let me explain why I don’t like Expected Exception attributes/annotations. The final nail in the coffin of this approach was hammered home for me when working with Liz Keogh a while back.

Here is an example in Java of the expected exception pattern (for brevity I won’t include assertions for e.getMessage()):


@Test(expected=BeyondMyExpertiseException.class)
public void shouldComplainWhenNotAClass() throws Exception {
	DomainExpert expert = new DomainExpert();
	String thisCheckpoint = nameOfSomeInterface();

	expert.howDoIRestore(thisCheckpoint);
}


So, apart from the obvious fact that it is only implicit which method threw that exception (because I know that none of the other steps can throw it) – and we want our tests to communicate information explicitly, yes? – the insight that Liz shared with me is that it changes the flow of information (ok, I’m paraphrasing now) compared to a test that doesn’t expect an exception.

In a ‘positive’ test, the flow of information that is expressed to the reader is what I need -> what I do -> what I expect. In an expected-exception test, this is changed to what I expect -> what I need -> what I do. The latter just doesn’t flow very well, and because it’s different to your positive tests there’s an overhead involved for the reader (me or someone else later on) in processing this shift in structure… I’ve found that such tests just don’t jump out at me when I’m scanning the tests…

Since then, despite fashion, I committed to the old-fashioned way of testing exceptions – “try/fail or catch”:


@Test
public void shouldComplainWhenNotAClass() throws Exception {
	DomainExpert expert = new DomainExpert();
	String thisCheckpoint = nameOfSomeInterface();

	try {
		expert.howDoIRestore(thisCheckpoint);
		fail("Should have thrown " +
			BeyondMyExpertiseException.class.getSimpleName());
	} catch (BeyondMyExpertiseException e) {
	}
}


Ok, I accept that it looks more cluttered by comparison, but the flow of information makes more sense and I make it explicit that the expert.howDoIRestore(thisCheckpoint) method call is the one that should have thrown the exception. (Note: the idea here is not to reduce how much you type but to make the test more expressive.) The “try/fail or catch” approach only tells you what was expected when your code doesn’t throw an exception at all… If your code throws a different exception, the failure trace just tells you what exception was actually thrown; it doesn’t tell you what exception was expected. So, here is a slightly different way of writing it:


@Test
public void shouldComplainWhenNotAClass() throws Exception {
	DomainExpert expert = new DomainExpert();
	String thisCheckpoint = nameOfSomeInterface();
	try {
		expert.howDoIRestore(thisCheckpoint);
		fail();
	} catch (Exception e) {
		assertThat(e,is(instanceOf(
				BeyondMyExpertiseException.class)));
	}
}


Notice that I’m only catching Exception now, not BeyondMyExpertiseException. This still feels a little jumbled… Because my assertion is inside the catch block, I have to have the fail() method call just after the call that should throw the exception. Hmmm… don’t like that… Instead, this makes more sense:


@Test
public void shouldComplainWhenNotAClass() throws Exception {
	DomainExpert expert = new DomainExpert();
	String thisCheckpoint = nameOfSomeInterface();
	Exception thrownException = null;

	try {
		expert.howDoIRestore(thisCheckpoint);
	} catch (Exception e) {
		thrownException = e;
	}
	assertThat(thrownException,is(instanceOf(
				BeyondMyExpertiseException.class)));
}


Giving this failure trace when it fails:


java.lang.AssertionError:
Expected: is an instance of com.testingreflections.atdd.expertise.misunderstanding.BeyondMyExpertiseException
     got: <java.lang.UnsupportedOperationException>


So, if you want to type as little as possible, this isn’t for you… But if you want the tests that drive out your exception handling to be more expressive, then this is an alternative to the usual “try/fail or catch” approach. I think that perhaps there’s an even better way of doing this… maybe next I’ll see how approximating closures with an anonymous class might help improve the readability of this… Let me know if you know of a better way.

Boredom: a Testing Smell?

This first appeared on my old blog in June 2005.

Somebody I know who was doing some (unscripted) testing spoke of being bored the other day… I have always found boredom to be a sign that something is wrong.

I believe, as has been said by Kaner et al, that testing is a brain-engaged activity. If that is the case, why would I ever be bored?

Borrowing the Smell metaphor… I would say that boredom is a Bad Testing Smell. If it isn’t a bad smell, it is a whiff of an underlying bad-smell for sure.

If on the rare occasion I find that I am bored, I’d ask myself:

  1. Am I testing this area more than I need to? (if so, why?)
  2. Am I losing concentration because my brain is tired? Do I need a break?
  3. Is what I am doing so repetitive that perhaps it should be automated?
  4. Is there a better way of testing this feature?

If I answer “yes” to any of these, I know that perhaps I need to do something differently. Whether that is feasible in a given context might be a different story.

Update 9th July 2010: This applies to scripted testing as much as it does to exploratory testing. Generally, I’ve found that boredom during scripted testing is because we’re asking a human to do something we should be asking a computer to do… The main barrier to this, in my experience, is that it is harder to write the automated tests than to write manual scripted tests… so, I look for ways to make automated tests as easy to write as manual scripted ones. Usually, I can achieve this with a little effort and have found it makes a huge difference.