Tag Archives: Software engineering

This is not a manifesto: Valuing Throughput over Utilisation

In a previous article, This is not a manifesto, I expressed the values I hold as a software development team member. Today, I’m going to talk about the first of these values.

Before I do, I’d like to say what I mean by “software development team”. I mean a cross-discipline team with the combined skills to deliver a software product – product owner, user experience, business analysts, programmers, testers, dev-ops, etc.

A common problem

Many teams I encounter will, at least in the beginning, have team-members who specialise in a single role. Each team-member will rarely, if ever, step outside their job description [1]. This can cause a problem.

Many teams find themselves in a situation where some team members have little to do on the current stories. At this point, the team has some choices, they can focus the under-utilised people on:

  1. Future user-stories, working on tasks most relevant to their job title
  2. Another team, working on tasks most relevant to their job title (e.g. ‘matrix management’)
  3. Current stories, taking on tasks that will take the story to completion sooner, even if it’s more relevant to someone else’s job title
  4. Things that will make the more utilised people get through their work faster, today and in the future.

More often than not, I find teams taking option 1. I think that people choose this option because it feels like it should increase the throughput of the team – i.e. the amount of new features they can add to the product. In fact, it has the opposite effect.

Little’s law

Let’s consider a team that is working in time-boxes or ‘sprints’ (à la Scrum) and measures it’s throughput with ‘velocity’ (story-points per-sprint). Points are accrued as each user-story is completed – i.e. coded, tested and product-owner validated.

In this particular team, it is able to complete an average of 10 points per sprint. Let’s say this is due to a bottleneck in the process. This bottleneck might limit the amount of testing that can be completed during the sprint. This is illustrated in fig.1 where each ball is a story point.


Shows a system through which a series of balls are being processed. Each ball represents one story point. It also shows that there is spare capacity in the team.

Shows a system through which a series of balls are being processed. Each ball represents one story point. It also shows that the spare capacity in the team has been used up, even though it does nothing to increase throughput.

Little’s law [2] tells us that the amount of time it takes to complete an item of work (cycle-time) is:

        Work in Progress (WIP)   

Which we can read as:

   story points in progress

In this simplified example, WIP is 10 points and throughput is 10 points per sprint. The average cycle time of a story is therefore 10/10 = 1 sprint. Notice, however, there’s all that spare capacity.

Let’s say this spare capacity is developers. So, the team starts taking on more user-stories from the backlog – increasing the number of story points in progress to 20 points (fig.2).

Because the bottleneck remains, velocity remains the same – 10 points per sprint. The amount of flexibility that the team has, however, is now reduced because the average time required to get a story to completion has increased from 1 sprint to 20/10 = 2 sprints [3].

Often this will take the form of testers working a sprint behind the developers. Worse still, over 10 sprints, the team still only completes 100 story points (as they would have before) but is left with a lot of unfinished work. This ‘inventory’ of unfinished work carries overheads which can, ultimately, reduce the velocity of the team.

The impact of filling capacity in this way has yet another effect.

Latency effect

As the developers get through more stories, a queue of stories that are “ready for testing” will build up. Testers working through these will, at least, have questions for the developer(s) or even find some defects. The developer, having moved on from that story, is now deep into the code and context of another story. When the tester has questions, or finds defects, relating to a story that the developer had worked on, say a week ago, then the developer has to reload their understanding of this old code and context in order to answer any questions or fix any defects. This context-switching carries significant overheads [4].

The end result is that the effort required to complete a story increases due to the repeated context switching, therefore reducing velocity.

So, not only does filling the capacity with more work fail to increase throughput, it adds costly context-switching overheads – ultimately slowing everything down.

This phenomenon is not unique to teams using fixed-length time-boxes, such as sprints. Strictly following Kanban avoids this problem, but what I’ve seen is some teams creating a ‘ready for testing’ queue – so that developers can start work on the next story. This has the same latency effect and turns a process designed for continuous flow into a batch and queue process. But, I digress.

What to do with the spare capacity?

The simple answer is to look at the whole approach and determine what is slowing things down. In the example above, I’d be wondering what’s slowing the testing down. Many of the things that hinder testing can be addressed by changing how we do things ‘upstream’.

Are lots of defects being found, causing the testers to spend more time investigating and reproducing them? Can we get the testers involved earlier? Can any predictable tests be defined up front so that developers can make sure the code passes them before it even gets to the testers (e.g. Behaviour Driven Development)?

Are the testers manually regression testing every sprint? Could the developers help by automating more of that? Are the testers having to perform repetitive tasks to get to the part of the user-journey they’re actually testing? Can that be automated by the developers?

Is there anything else that is impacting them? Test data set-up and maintenance, product owner availability to answer questions? Anything else?

Addressing any of these issues is likely to speed up the testing process and increase throughput of the entire team as a result. One solution is to put these types of tasks onto the product backlog. This is fine but if we assign point values to them it can give a skewed view of velocity. Or rather, velocity is no longer a measure of throughput. You won’t be able to see if these types of tasks are actually improving things unless you are also measuring what proportion of the points are delivering new product capabilities.

The only good reason I can think of for story-pointing these throughput-enhancing tasks is if your focus is utilisation – i.e. maximising the number of story-points in progress. Personally, I care more about measuring and improving throughput. By doing so, we get the right utilisation for free and a faster, more capable team.

Up next: Valuing Effectiveness over Efficiency.


[1] “Lessons Learned in Close Quarters Battle” Illustrates how stepping outside our job descriptions can move the team through each story more quickly by using special-forces room-clearing as an analogy.

[2] Little’s law (PDF) – the section “Evolution of Little’s Law in Operations Management” that references Hopp and Spearman’s observation about throughput (TH), work-in-progress (WIP) and cycle-time (CT) – i.e. TH=CT/WIP. And therefore we can say CT = WIP/TH.

[3] Little’s law also illustrates that if we reduce the work in progress to one quarter, then the cycle time for each story reduces to one quarter of a sprint. We’ll still get through 10 points per sprint, but stories will be completed more as a continuous stream throughout the sprint rather than all at the end.

[4] “The Multi-Tasking Myth” by Jeff Atwood, talks about multi-tasking across projects and pulls together several resources to illustrate the impact of multitasking at various levels. This applies when multi-tasking across stories.


I’d like to say a special thank you to my fellow RiverGliders – Andy Palmer and James Martin – for the feedback that helped me refine this article.

Old Favourite: Adaptive Budgets? “Pull” the other one!

This was originally posted on my old blog on 10th April 2010

Recently, I wrote about my views on using and estimating with task-cards. I highlighted that tracking progress with burn-up/down charts showing effort completed/remaining is not a true measure of progress, especially if we subscribe to the idea that we measure progress with working-software.

I also highlighted how tasks “horizontally slice” a “vertically sliced” story.

This inspired an article on infoq by Mark Levison. Since its publication, there have been several comments.

There are some specific points I’ll be answering on infoq, but some general points are more easily answered here…

True measures of progress
One of the key points I was trying to make in my first post (linked to above) is that if “working software” is our measure of progress then tracking task completion is not consistent with that.

One of the problems task-hours are trying to solve is to provide a visible indication of progress. It’s also something that many people find easier to understand. The idea of relative-points estimation seems to baffle many people when they first encounter it.

In my original post (linked above) I referenced an approach, blogged about by Jason Gorman, outlining a solution to providing a visible indication of progress that is compatible with the idea of measuring progress with working software. It also works more seamlessly with the solution of measuring throughput trends of the team (e.g. with story points) so that the team can estimate how much can be done in the next iteration.

Value Points & Task Points
Value points were mentioned in the infoq comments. Value points are solving a different problem and don’t help the team estimate what can be done. The value of a story isn’t an indication of its complexity. Value, combined with complexity points, can help determine priorities. Arlo Belshey explains some interesting views on this.

Prioritisation is, in my view, one of the few reasons to place some sort of estimate on value & complexity. Unfortunately, it seems to be often used to establish a contract with the team for when something will be done and apply pressure when the estimates turn out to be inaccurate.

Non-estimating pull systems
My preference is to use such estimates only for prioritisation… after that, things just take as long as they take. There is some benefit in tracking accuracy trends so that we can learn from them, however, spending too much time on these things simply slows down the process of getting things done. Interestingly, I have not yet seen the business be quite so passionate about evaluating whether their value estimates were accurate when the product makes it to the real world. Funny that.

Instead, I prefer to leave estimation behind once we’ve prioritised things and pull stories or customer valued work items through the system without using previous estimates to predict their completion date. With the way that most budgets are determined and allocated, this idea isn’t compatible with the way much of the business world works. For pull systems that eliminate estimation as a means of predicting the future to work (such as Kanban), we need a more adaptive approach to budgetary spend. Business needs to find a way to adapt budget allocation more frequently, perhaps as frequently as monthly, perhaps more frequently still. The business now needs to look at how it can respond to change rather than focus on following a budget-plan.

Business-world: now it’s your turn
Budget holders now need to be more agile. We, those who evolve the implementation of the ideas of the business, have responded to their demands to be able to respond to rapidly changing markets and provide that competitive edge. For the more mature implementation teams, the hindrance now is no longer how we make the products, but the business culture of inflexible and predictive budget allocation.

Old Favourite: Taking Repetition To Task

This originally appeared on my old blog on 16th March 2010…

Others have talked about the virtues of stories as vertical slices of a problem (end-to-end capabilities) rather than horizontal slices (system layers or components). So, if we slice the problem with user stories, how do we slice the user-stories themselves?

If, as I sometimes say, acceptance tests (a.k.a. examples/scenarios/acceptance-criteria) are the knife with which we slice a story into even thinner vertical slices, then I would say my observation of ‘tasks’ is that they are used as the knife used to cut a story into horizontal slices. This feels wrong…

Sometimes I also wonder, hasn’t anyone else noticed that the idea of counting the effort of completed tasks on burn-down/up charts is counter to the value that we measure progress only with working software? Surely it makes more sense to measure progress with passing tests (or “checks” – whichever you prefer).

These are two of the reasons I’ve never felt very comfortable with tasks, because:

  • they’re often applied in such a way that the story is sliced horizontally
  • they encourage measuring progress in a less meaningful way than working software

Tasks are, however, very useful for teams at first. Just like anything else we learn how to do, learning how to do it on paper can often help us then discard the paper and do the workings in our heads. However, what I’ve noticed is that most teams I’ve worked with continue to write and estimate tasks long after the practice is useful or relevant to them.

For example, there comes a time for many teams where tasks become repetitive. “Add x to the Model”, “Change View”… and so on. Is this adding value to the process or are you just doing it because the process says you should do it?

Simply finding that your tasks are repetitive doesn’t mean the team is ready to stop using them. There is another important ingredient, meaningful acceptance criteria (scenarios / acceptance-tests / examples).

I often see stories with acceptance criteria such as:

  • Must have a link to save the profile
  • Must have a drop down to select business sector
  • Business sector must be mandatory

Although these are “acceptance criteria” they aren’t what we mean by acceptance criteria in the context of user stories. Firstly, they are talking about how the user interacts rather than what they need to achieve (I’ve talked about this before). Secondly, they aren’t examples. What we want are the variations that alter the behaviour or response of the product:

  • Should create a new profile
  • Profile cannot be saved with blank “business sector”

As our product fulfils each of these criteria, we are making progress. Jason Gorman illustrates one way of approaching this.

So, if you are using tasks, consider an alternative approach. First, look at your acceptance criteria, make sure they are more like examples and less like instructions. Once that’s achieved, consider slicing each criterion (or scenario) horizontally with the tasks rather than the story. Pretty soon, you’ll find that you don’t need tasks anymore and you can simply measure progress in terms of the new capabilities you add to your product.

Monsters, Names, Pot-Roast & The Waterfall Model

“Antony” (without the ‘H’) is the anglicised version of Antonius. In victorian times (there or thereabouts I’m guessing), among those wishing to appear oh so intelligent, gossip spread that the spelling of “Antony” was wrong… For, so they would say, it is born of the greek word “anthos” (meaning “flower”) – oh dear… so many poor children with misspelt names… 

Despite being completely wrong, the world forgot of my name’s etruscan origin and spelt it with an ‘H’… This misinformation established itself through the eras so much so that, today, the de-facto spelling is “Anthony”. It has even found it’s way into the American pronunciation of the name as: “An-thon-ee”.

Waterfall development has something in common with this story… somehow, through misinformation, what it once was has been warped, into something else.

The key difference is that Waterfall is now increasingly represented as was originally intended. Unfortunately for me, my name is not…


Monsters & Legends

Some might think that the Waterfall Model is an approach to software development, first explained (but not named) in Winston Royce’s 1970 Paper “Managing the Development of Large Software Systems” (PDF), but they could be wrong…

Somehow, it seems to have become something else… it became the way (many) people thought software should be developed… the norm for software ‘professionals’. Years of anecdotal failure followed and Waterfall became a legend – told time and again much like a scary camp-fire story… The enemy of effective software development… A monster that will consume all the resources it can, spewing out nothing but documentation, rarely concluding in working software – at best 20% of the time.

This negative view, to what was once the de-facto approach to software development, is actually far closer to Royce’s original words on Waterfall than many seem to know…


The Truth & Technology

In Royce’s original paper, he shows a progression of activities, that came to be known as the waterfall model.

What we rarely hear of is Royce’s original words on the subject:

“…the implementation described above is risky and invites failure.”

Further to this, Royce goes on to explain that the reason that this cannot work is because there are too many things we cannot analyse up-front:

“The testing phase which occurs at the end of the development cycle is the first event for which timing, storage, input/output transfers, etc., are experienced as distinguished from analyzed. These phenomena are not precisely analyzable. They are not the solutions to the standard partial differential equations of mathematical physics for instance.”

He explains that we need feedback loops. He goes on to warn of (a conservative) 100% overrun in schedule and costs:

“…invariably a major redesign is required. A simple octal patch or redo of some isolated code will not fix these kinds of difficulties. The required design changes are likely to be so disruptive that the software requirements upon which the design is based and which provides the rationale for everything are violated. Either the requirements must be modified, or a substantial change in the design is required. In effect the development process has returned to the origin and one can expect up to a 100-percent overrun in schedule and/or costs.”

Some of Royce’s strategies, like “Involve the customer” and obtaining early feedback, have lived on in modern (Agile) methodologies. Beyond that, we should remember that his specific recommendations on how to solve the problems of a waterfall model were all based on the technology of the time.


Pot-Roast & The Cost of Change

In 1970, computing was much more expensive than it is today. In those days, changing software was far more expensive than changing pictures and words on paper. It was also much harder to express your design in a human-friendly way in the programming languages of that time. As a result of these and other factors, documentation was a major part of how Royce tried to solve the inherent problems of the Waterfall model.


Technology, tools & thinking have moved on and our documentation no longer needs to be static. It lives. It can breath. The specification can automatically verify that the implementation does what we said it should do (e.g. as in BDD Specs or ATDD/TDD Tests). Modern programming languages allow us to express the design and our understanding of the domain far more clearly, negating the need to first detail our thoughts on paper in natural language. We simply don’t have to cut the ends off that pot-roast anymore.


Only now, at the end…

Waterfall, thanks to the popularity of Agile, has gone from something I was shown at school as “how software is developed” to being seen in the light that it was originally presented – how software should not be implemented.

This is despite those who still profess the legitimacy of Waterfall and those still shocked and surprised when they hear of Royce’s own words against the monster he unintentionally created.

As for my name, I hold out little hope for change. I doubt that the world will use my name as it was originally intended and so I have resigned myself to needing two domain names… one with an ‘H’ in it, and the correct one without – I wonder which one brought you here.