Archive for the ‘Agile Estimating’ Category

Separate Estimating from Committing

Tuesday, February 9th, 2010

A fundamental and common problem in many organizations is that estimates and commitments are considered equivalent. A development team (agile or not) estimates that delivering a desired set of capabilities will take seven months with the available resources. Team members provide this estimate to their manager who passes the estimate along to a vice president who informs the client. And in some cases the estimate is cut along the way to provide the team with a “stretch goal.”

The problem here is not that the team’s estimate of seven months is right or wrong. The problem is that the estimate was turned into a commitment. “We estimate this will take seven months” was translated into “We commit to finishing in seven months.” Estimating and committing are both important, but they should be viewed as separate activities.

I need to pick up my daughter from swim practice tonight. I asked her what time she’d be done (which we defined as finished swimming, showered, and ready to go home). She said, “I should be ready by 5:15.” That was her estimate. If I had asked for a firm commitment (be outside the facility by the stated time or I’ll drive away without you), she might have committed to 5:25 to allow herself time to recover from any problems, such as a slightly longer practice, the coach’s watch being off by five minutes, a line at the showers, and so on. To determine a time she could commit to, my daughter would still have formed an estimate. But rather than telling me her estimate directly, she would have converted it into a deadline she could commit to.

Don’t let your estimates become commitments. Remember the difference between an estimate and a commitment and keep the two activities separate, educating management and customers as necessary. I talk much more about agile estimating and committing in my new book, Succeeding with Agile.

How Do Story Points Relate to Hours?

Sunday, February 8th, 2009

I’m often asked about the relationship between story points and hours. People who ask are usually looking for me to say something like “one story point = 8.3 hours.” Well, that just isn’t the case (especially since I made up 8.3 hours). Let’s see what the real relationship is between a story point and hours…

Suppose for some reason you have tracked how long every one-story-point story took to develop for a given team. If you graphed that data you would have something that would look like this:

Number of hours to develop various one-point stories

This shows that some stories took more time than others and some stories took less time, but overall the amount of time spent on your one-point stories takes on the shape of the familiar normal distribution.

Now suppose you had also tracked the amount of time spent on two-point user stories. Graphing that data as well, we would see something like this:

Number of hours to develop various one- and two-point stories

If the one-point stories are centered around a mean of x, ideally the two-point stories will be centered around a mean of 2x. This will never be exactly the case, of course, but a team that does a good job of estimating will be sufficiently close for reliable plans to be made from their estimates.

What these two figures show us is that the relationship between points and hours is a distribution. One point equals a distribution with a mean of x and some standard deviation. The same is true, of course, for two-point stories, and so on…

By the way, notice that I’ve drawn the distributions of one- and two-point stories as having overlapping tails. It is entirely realistic that the biggest story a team put “one story point” on might turn out to take more time than the smallest story they put a two on. After all, no team can estimate with perfect insight, especially at the story point level. So, while the tails of the one- and two-point distributions will overlap, it would be extraordinarily unlikely for the tails of, say, the one- and thirteen-point distributions to overlap.
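
As a rough illustration of the overlapping tails, the sketch below models hours per story as independent normal distributions and computes the chance that a randomly chosen smaller story takes longer than a randomly chosen larger one. The means and standard deviations are invented for illustration, as is the helper name `prob_smaller_takes_longer`; real figures would have to come from a team's own tracking data.

```python
import math
from statistics import NormalDist


def prob_smaller_takes_longer(small: NormalDist, large: NormalDist) -> float:
    """P(hours for a 'small' story > hours for a 'large' story).

    Assumes the two durations are independent and normally distributed,
    so their difference is also normally distributed.
    """
    diff = NormalDist(small.mean - large.mean,
                      math.hypot(small.stdev, large.stdev))
    return 1.0 - diff.cdf(0.0)


# Hypothetical hours-per-story distributions (made-up numbers).
one_point = NormalDist(mu=6, sigma=2)          # mean x
two_point = NormalDist(mu=12, sigma=3)         # mean 2x
thirteen_point = NormalDist(mu=78, sigma=15)   # mean 13x

p_one_vs_two = prob_smaller_takes_longer(one_point, two_point)
p_one_vs_thirteen = prob_smaller_takes_longer(one_point, thirteen_point)

print(f"P(1-point story outlasts a 2-point story):  {p_one_vs_two:.3f}")
print(f"P(1-point story outlasts a 13-point story): {p_one_vs_thirteen:.2e}")
```

With these assumed numbers the first probability is a few percent, while the second is vanishingly small, which is the pattern the figures describe.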

Is It a Good Idea to Establish a Common Baseline for Story Points?

Saturday, August 9th, 2008

In my previous post, I wrote about how to establish a common baseline for story points across relatively large teams (a few hundred developers). In this post I want to consider whether doing so is a good idea.

The need for a common baseline to story points usually arises from the reasonable desire to know how big the entire project is. To know that, we must know the size of the work to be done by each team. Unfortunately, along with this goal comes the ability to compare teams based on their velocities. Since many managers are constantly looking for ways to compare team and individual performance it is not surprising that they begin to make such velocity comparisons. Almost all such comparisons are disruptive to performance of the combined, overall group or department.

A chart such as the one that follows can show a lot of interesting information.

Velocities before teams told they would be compared

However, this chart can be very dangerous because of how teams will assume the data is being interpreted. Shown a chart like this, a common team response will be to feel that they need to be faster than the other teams. Achieving this additional speed may come from working in a more focused manner (a good thing), but it may come instead from sacrificing quality, leaving important refactorings undone, or a variety of other not-so-good means.

Some teams may respond to the pressure for their abstract measure of velocity to increase by gradually inflating the number of story points assigned to a story. This can happen in subtle and not particularly nefarious ways that can accumulate into large problems. Consider, for example, a team that is arguing over whether a particular story should be estimated at 5 or 8 points. If the team is under pressure (real or just perceived) to increase velocity they will be more likely to assign the 8. The next story the team considers is slightly larger. They compare it to the newly assigned 8 and decide to give it a 13. Without pressure to improve velocity, this same team may have given the first item a 5 and the second (slightly larger still) item an 8. In this one scenario the team has inflated their points from 5+8=13 to 8+13=21, or more than 50%. Story point inflation such as this tends to happen very quickly if it happens at all.
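
The arithmetic behind that example can be checked in a few lines. This is only a worked calculation of the numbers in the paragraph above; the variable names are illustrative.

```python
# Estimates the team would have given without velocity pressure.
unpressured = [5, 8]
# Estimates the same two stories received under pressure.
pressured = [8, 13]

# Relative growth in total points for identical work.
inflation = (sum(pressured) - sum(unpressured)) / sum(unpressured)

print(f"{sum(unpressured)} -> {sum(pressured)} points: {inflation:.1%} inflation")
```

Two stories were enough to inflate the total by roughly 60%, with each individual decision looking defensible on its own.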

Consider what happened in the next few iterations for the four teams shown in the previous figure.

Four teams and their velocities

Not surprisingly, someone in the Project Management Office distributed the chart showing the similarities over the first three iterations. Two of the teams reacted by instantly inflating their story points. After seeing that, the yellow team followed suit. The green team is either extremely virtuous or they haven’t noticed the charts yet.

So, should you establish a common baseline? Yes, if there are advantages to doing so on your project. If you do, however, you need to make sure you go out of your way to create safety around that baseline for the teams. Stress that this isn’t being done as a way to compare teams and that you (and your bosses) know that there are many factors that influence velocity, not just “how good” a team is.

Establishing a Common Baseline for Story Points

Wednesday, August 6th, 2008

A common criticism of story points is that the meaning of a story point will differ between teams. In this post I want to describe how we can establish a common definition of a story point across multiple teams within an organization.

The best way I’ve found to do this is to bring a broad group of individuals representing various teams together and have them estimate a dozen or so product backlog items (ideally, in my opinion, in the form of user stories). Not every estimator needs to understand every item, but most people should understand most items. The items being estimated do not need to be new items; some could be from a project finished recently that many estimators remember or worked on. Some items could be artificial; perhaps the team is asked to estimate “a typical transaction activity report.” If that meant something to most estimators, it would be a good candidate item.

I’ve done this with 46 people in a large conference room: 44 estimators plus me and a coach from my client who wanted to watch so he could moderate such a meeting the next time one would be needed. The 44 estimators represented 22 teams; two estimators per team were in the meeting. If you’ve seen or used the Mountain Goat planning poker cards, you’ll have noticed that they feature a very large number in the middle (plus the number in a smaller font in the corners). We could have done something cute like put eight little goats on the eight card. We put the very large number there deliberately, though: We wanted it to be visible across a potentially large conference room.

You can probably imagine how difficult it might be to gain consensus among 46 people playing planning poker. While it will not take proportionately longer to derive estimates, it does take quite a while with that many people. I think it took us about two hours to estimate twelve items.

But when that meeting was over, each pair of estimators went back to their teams with twelve estimates. Those estimates could then be used as the basis for estimating future work. As each team estimated new product backlog items they would do so by comparing them to the initial 12 plus any estimates that had been produced since (by them or any other team).

I’ll blog next about when it may or may not be a good idea to establish such a common baseline.

When Should We Estimate the Product Backlog?

Sunday, March 16th, 2008

I was recently emailed a question asking whether the sprint planning meeting should start with time allocated for putting story point estimates on any user stories that have not yet been estimated.

No, I don’t think this is a good idea. Keep in mind that we put estimates on product backlog items (which I recommend be user stories) so that:

  • the product backlog can be prioritized. It is impossible to fully prioritize a set of items without knowing at least their relative cost.
  • we can make high-level forecasts about how much will be done by when
  • we can make tradeoff decisions between scope and schedule

We can achieve these goals with approximate, relative estimates such as given by story points. For example, if I decide to buy a new car this weekend it is sufficient for me to know that I can get a Toyota or Honda for around $30,000 and that a Ferrari goes for closer to $800,000. I do not need to know a more precise cost of the Honda ($31,850) before knowing it should be on my short list of cars to evaluate while the Ferrari should not.

Sprint planning meetings typically go into deeper detail than is appropriate for product backlog item estimating (whether in story points or ideal days). Since we become accustomed during sprint planning to breaking user stories into tasks and considering those tasks in more detail, there is a chance this will carry over into any story point estimating done during the same meeting.

So, when should story point estimating happen? I’ll describe the ideal case, which you can easily adjust for the real-world intrusions on your project. Projects typically start in either of two ways:

  1. A reasonably fully stocked product backlog is written before the first sprint begins and all items are estimated before the first sprint planning meeting. [If you do this, be careful not to write all user stories with too much detail. Each product backlog item you write represents an investment. User stories should therefore be elaborated just-in-time and in just-enough detail that they can be turned into functionality in one sprint.]
  2. We know we’ve got to do the project so we dive in. During the first sprint or two, all the user stories are written and estimated just like above.

On an ongoing basis, once per sprint I recommend that the ScrumMaster tell the team something like, “Hey, we’ve had five new user stories come in this sprint and we need to estimate them. Everyone plan on hanging around for a bit after tomorrow’s daily scrum meeting and we’ll play Planning Poker to estimate the new items.” Doing it right after the daily scrum helps cut down on the number of interruptions in total. I usually aim for having that meeting about two days before the end of the sprint. That way the product owner will have estimates on the new items so she can prioritize prior to the start of the next sprint.

Why I Don’t Use Story Points for Sprint Planning

Thursday, November 8th, 2007

As described in Agile Estimating and Planning, I’m a huge fan of using story points for estimating the product backlog. However, I also recommend estimating the sprint backlog in hours rather than in points. Why this seeming contradiction?

I’ve previously blogged on the reasons why I recommend using different estimation units (points and hours) for the different backlogs. But I’m often asked this related question I want to address here:

I’m curious why you aren’t using story points to do your sprint planning.  I thought that the point of measuring story point velocity was partly to determine how much we can take on (or commit to) in a sprint.  Do you only use story points for longer-term planning (e.g. release planning)?

I don’t use story points for sprint planning because story points are a useful long-term measure. They are not useful in the short term. It would be appropriate for a team to say “We have an average velocity of 20 story points and we have 6 sprints left; therefore we will finish about 120 points in those six sprints.” It would be inappropriate for a team to say, “We have an average velocity of 20 story points so we will finish 20 points in the next sprint.” It doesn’t work that way.

Suppose a basketball team is in the middle of their season. They’ve scored an average of 98 points per game through the 41 games thus far. It would be appropriate for them to say “We will probably average 98 points per game the rest of the season.” But they should not say before any one game, “Our average is 98 therefore we will score 98 tonight.”

This is why I say velocity is a useful long-term predictor but is not a useful short-term predictor.
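
One way to see why the average works better over many sprints than over one is to look at how sprint-to-sprint variation accumulates. The sketch below uses the velocity of 20 and the six remaining sprints from the example above; the standard deviation of 5 points and the assumption that sprints vary independently are mine, chosen only for illustration.

```python
import math

mean_velocity = 20      # average points per sprint (from the example above)
stdev_velocity = 5      # hypothetical sprint-to-sprint spread
sprints_left = 6

# Expected total over the remaining sprints: 20 * 6 = 120 points.
expected_total = mean_velocity * sprints_left

# For independent sprints, variances add, so the spread of the total
# grows with sqrt(n) while the total itself grows with n.
stdev_total = stdev_velocity * math.sqrt(sprints_left)

relative_spread_one_sprint = stdev_velocity / mean_velocity
relative_spread_total = stdev_total / expected_total

print(f"Expected total: {expected_total} points")
print(f"Relative spread, one sprint:  {relative_spread_one_sprint:.0%}")
print(f"Relative spread, six sprints: {relative_spread_total:.0%}")
```

Under these assumptions the six-sprint forecast is proportionally much tighter than any single-sprint forecast, even though both rest on the same average.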

Velocity will bounce around from sprint to sprint. That’s why I want teams to plan their sprints by looking at the product backlog, selecting the one most important thing they could do, breaking that product backlog item / user story into tasks and estimating the tasks, asking themselves if they can commit to delivering the product backlog item, and then repeating until they are full. No discussion of story points. No discussion of velocity. It’s just about commitment and we decide how much we can commit to by breaking product backlog items into tasks and estimating each. This is called commitment-driven sprint planning.

When a team finishes planning a sprint in this way it is indeed likely that the number of story points they have unknowingly committed to will be close to their long-term average, but it will vary some. It will also be true that a team will commit to approximately the same number of hours from one sprint to the next. I use the term capacity to refer to this number of hours because velocity is reserved for measuring the amount of work planned or completed as given in the units used to estimate the product backlog (which I recommend be done using story points).
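
The planning loop described above can be sketched roughly as follows. The names `BacklogItem` and `plan_sprint`, the sample stories, and the 40-hour capacity are all hypothetical. In a real meeting "can we commit to this?" is a team judgment; comparing task hours against remaining capacity is a simplifying assumption that stands in for it here.

```python
from dataclasses import dataclass, field


@dataclass
class BacklogItem:
    name: str
    task_hours: list[float] = field(default_factory=list)  # per-task estimates

    @property
    def hours(self) -> float:
        return sum(self.task_hours)


def plan_sprint(backlog: list[BacklogItem], capacity_hours: float) -> list[BacklogItem]:
    """Walk the backlog in priority order and commit until the team is full.

    Note that story points and velocity never appear: the decision is
    driven only by task estimates in hours.
    """
    committed: list[BacklogItem] = []
    remaining = capacity_hours
    for item in backlog:               # backlog is assumed sorted by priority
        if item.hours > remaining:     # team cannot commit to this item: full
            break
        committed.append(item)
        remaining -= item.hours
    return committed


# Hypothetical backlog, most important item first.
backlog = [
    BacklogItem("Search screen", [8, 6, 4]),
    BacklogItem("Login", [5, 5]),
    BacklogItem("Activity report", [10, 12, 6]),
    BacklogItem("Export", [4]),
]

sprint = plan_sprint(backlog, capacity_hours=40)
print([item.name for item in sprint])
```

The sketch stops at the first item that does not fit, mirroring "repeating until they are full"; a team could equally decide to look further down the backlog for something smaller.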

Don’t Average During Planning Poker

Thursday, October 11th, 2007

I like to use Planning Poker to estimate the user stories on an agile team’s product backlog. In this approach individual estimators hold up cards showing their estimates. If estimators disagree they discuss why, ask questions of their product owner (who should be present), and repeat until they come to consensus. Team members often ask me whether they really need to come to consensus or whether they can just take the mean of the individual estimates.

The problem with averaging is that it is too easy. Rather than have the fierce discussion that is one of the huge benefits of playing Planning Poker, teams fall into a trap of playing one or two rounds and then just averaging. An obvious dysfunction is that one estimator may play the 100 card not because he thinks it will take that long but because he thinks 20 is the right number and other estimators are thinking 8 and 13. For this reason and others, if a team truly feels compelled to average, they should take the median (middle value) rather than the mean (sum of estimates divided by number of estimates).
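
A small sketch shows why the median resists that kind of gaming. The three-estimator round below is a made-up example built from the numbers in the paragraph above.

```python
from statistics import mean, median

# Hypothetical round: two estimators play 8 and 13; a third believes 20.
honest = [8, 13, 20]
# Same round, but the third estimator plays 100 to drag the mean upward.
gamed = [8, 13, 100]

for label, cards in (("honest", honest), ("gamed", gamed)):
    print(f"{label}: mean = {mean(cards):.1f}, median = {median(cards)}")
```

Playing the 100 card roughly triples the mean, while the median stays at 13 in both rounds.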

A lot of dark corners are illuminated through the discussion; teams lose out on that when they average. So while I want teams to come to agreement, I don’t care how heartfelt the agreement is. If we agree on 13 some of us may really believe that’s the right number. Others may think 8 is right but that 13 is “close enough.” Still others may think we’ve discussed the item too long and even though it should be a 20 will give in and call it a 13 just to be done with it.

So, rather than average, if the team is at an impasse I suggest going another round. If still stuck, someone should suggest a reasonable number and see if everyone can “support it” rather than “think it’s the absolutely perfect number.”

To Re-estimate or not; that is the question

Sunday, September 2nd, 2007

Should a team that is estimating in story points ever re-estimate? This is a question I’m commonly asked and would like to address here.

Most people have a natural feeling that re-estimating is somehow wrong but they can’t quite say why. I’ll encourage those individuals to stick to their hunches, and hopefully I can provide some of the reasoning that supports the natural inclination that most re-estimating is wrong. Philosophers talk about two types of knowledge. The first is a priori knowledge, which is knowledge before you experience something. Let’s call this knowledge-before-the-fact. This is the type of knowledge we have when we estimate something. Before developing the new search screen I think it’s about 8 story points, because it seems to be about the same total effort as some other 8-point story. The other type of knowledge is called a posteriori knowledge by the philosophers. This is knowledge after the fact.

When we estimate it is important that we not mix knowledge-before-the-fact with knowledge-after-the-fact. Suppose you are looking at a Scrum product backlog that has just been estimated with none of the work started. Each of those estimates was given before-the-fact (a priori). Now suppose you are looking at the same project a few months later. You’ve got a list of completed work; some of the items on that list still show their original, before-the-fact estimates but some have been re-estimated with after-the-fact estimates. The product backlog is similarly mixed: mostly the initial, before-the-fact estimates but some estimates that have been revised after-the-fact because of what was learned by developing previous user stories off the backlog.

Having both before-the-fact and after-the-fact estimates on your product backlog and list of finished work can cause a lot of confusion for the project. When all estimates are given in before-the-fact numbers we can reason about them and compare them. Suppose the team is estimating a new item and wants to say it’s equivalent to 20 story points because it’s similar to another item that has been estimated at 20 story points. That logic makes sense if the original item has not been re-estimated. If the old item was given an estimate of 10 before the fact and re-estimated to 20 after the fact then it is harder to know if the new item should get a 10 or a 20. With the re-estimation having occurred we’re in the position of saying “Before I start this one I think it’s a 20 because the other one felt like a 20 after I did it.” That’s weaker than “Before I do either of these they seem the same size.”

So, does this mean you should never re-estimate? Absolutely not. There are times when you want to re-estimate. First, re-estimating is generally useful when you completely blew it on the original estimate and can see that the mistake was a rare occurrence. (That is, if every estimate is systematically off by half I wouldn’t re-estimate.) Second, you should re-estimate when there has been a change in relative size. For example, the team has discovered that learning AJAX will be about half as hard as they thought. We’d want to fix that because the new knowledge tells us that our relative estimates are off-kilter for the AJAX-heavy stories.
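
One possible reading of why a systematic error needs no correction while a relative change does: velocity is measured in the same units as the estimates, so a uniform error cancels out of any forecast. The sketch below illustrates that reading with invented numbers (a velocity of 16, 80 points remaining, 30 of them AJAX-heavy); the function name `sprints_remaining` is hypothetical.

```python
def sprints_remaining(backlog_points: float, velocity: float) -> float:
    """Forecast: remaining size divided by points finished per sprint."""
    return backlog_points / velocity


# Hypothetical project: original, before-the-fact estimates left alone.
velocity, backlog = 16, 80
baseline = sprints_remaining(backlog, velocity)

# Doubling every estimate (finished and unfinished) changes nothing:
# velocity doubles along with the backlog.
uniformly_rescaled = sprints_remaining(backlog * 2, velocity * 2)

# A relative change is different. Suppose 30 of the 80 remaining points
# are AJAX-heavy stories that now look half as hard (30 -> 15), and the
# completed work contained no AJAX stories, so velocity is unchanged.
after_relative_fix = sprints_remaining(backlog - 30 + 15, velocity)

print(f"Baseline forecast:        {baseline} sprints")
print(f"All estimates doubled:    {uniformly_rescaled} sprints")
print(f"AJAX stories re-estimated: {after_relative_fix} sprints")
```

Under these assumptions the uniform correction leaves the forecast untouched, while fixing the relative size of the AJAX-heavy stories moves it, which is when re-estimating earns its keep.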