Mike Cohn's Blog

It's Effort, Not Complexity

A client asked me last week “When will my team be done with this project?” This is probably the bazillionth time I’ve been asked that agile project management question in one way or another. I have never once been asked, “How hard will my team have to think to develop this project?” Clients, bosses, customers, and stakeholders care about how long a project will take. They don’t care about how hard we have to think to deliver the project, except to the extent that the need to think hard implies schedule or cost risk. I mention this because I find too many teams who think that story points should be based on the complexity of the user story or feature rather than the effort to develop it. Such teams often re-label “story points” as “complexity points.” I guess that sounds better. More sophisticated, perhaps. But it’s wrong. Story points are not about the complexity of developing a feature; they are about the effort required to develop a feature.

In a class a few years back, I was given a wonderful example of this. Suppose a team consists of a little kid and a brain surgeon. Their product backlog includes two items: lick 1,000 stamps and perform a simple brain surgery–snip and done. These items are chosen to presumably take the same amount of time. If you disagree, simply adjust the number of stamps in the example. Despite their vastly different complexities, the two items should be given the same number of story points–each is expected to take the same amount of time.

This example also points out another aspect of agile estimating, which is that we assume that in general the right person for the job will do the work. We do not assume the little kid will finish school, go to med school, do a seven-year residency and only then begin the brain surgery while we have a skilled surgeon sitting in a cubicle licking stamps. Of course reality intrudes and occasionally the “wrong” person for a job does the job, but that will rarely be as dramatic as in this example.

So, story points are about the effort involved. Feel free to adjust your estimate of effort based on things like risk and uncertainty, but point-based estimating is about the time the work will take. It’s what our clients, bosses, customers, and stakeholders care about.

About the Author

Mike Cohn is the founder of Mountain Goat Software, a process and project management consultancy that specializes in helping companies adopt and improve their use of Agile processes and techniques. He is the author of User Stories Applied for Agile Software Development, Agile Estimating and Planning, and Succeeding with Agile. Mike is a co-founder of the Agile Alliance. He is also a co-founder and current board member of the Scrum Alliance. He can be reached at mike@mountaingoatsoftware.com

100 Comments:

John Sonmez said…

Hi Mike,

Love your book.  I agree with you mostly.  Except for one little catch.
What happens when something goes wrong?

In your example with licking the stamps, it is easy to fix: if you mess up licking a stamp, you just re-do that one.

If you mess up in the brain surgery, that simple task could become huge.  Or even if you don’t mess up, when you get inside that brain you could find that what you thought was simple was very complicated.

So, I would say that I agree with you that effort points are the amount of effort, not the complexity, but… the only thing I would suggest adding is to say that you have to factor in complexity to a certain degree, because complexity = variability. 

The random unknown doesn’t spring up too often when licking stamps, but during brain surgery, it is much more likely.

Excellent point though.  I always thought it was strange when people say “effort points” have nothing to do with time.  That is completely bogus.

Tobias Mayer said…

Nice post. Thanks for the clarity. Because we are not estimating in time units many people assume we are not measuring time. Your example helps remove this confusion.

David Bland said…

I tend to view the relative sizing of User Stories in terms of Complexity, Effort & Doubt that roll up into a “size”.

Am I in the minority on this?

Mark Kilby said…

David… you are not in the majority and I’m surprised by this stance from Mike.  I teach teams that points are ALWAYS a measure of complexity, effort, and doubt.  Otherwise, those adopting agile and story points (teams and management) have a more difficult time seeing the benefit of points.  If it’s just another measure of effort, why use them?  If it’s a different measure, then we have to think about them differently and that implies we have to have conversations about those differences.  Those conversations are key to an agile adoption.  Once a team has a velocity established, you can start seeing how the points map over time and you get a better sense of effort.  But if it’s just a measure of effort, I can see where you will have some folks that will try to map points to hours, compare points between teams, and start to use points as just another stick.

Andrej said…

I didn’t expect this. I always thought story points are a way to measure the size of the requirement, not the time it takes to implement the requirement.

If story points are about the time it will take to do the work, velocity doesn’t make sense anymore. Something which takes the team 10 story points in the beginning of a project, might take the team 5 story points at the end of project because of improved experience. If you measure story points in time, the team might be twice as productive, without this being visible in the velocity.

On the other hand, if story points measure size of the requirement, velocity will increase as the team gets more productive.

Tobias Mayer said…

@David (and others) Story points are “size points” and map to time retroactively. If they didn’t, how do we make release predictions? They provide empirical data for making predictions. Mike’s point, clearly made I thought, is that it isn’t about sizing complexity, but about sizing effort.

@Andrej As a team gets better they get more story points done because each point takes less effort due to improved tools and process. Velocity makes perfect sense: it shows that improvement.  Relative to other stories the sizes remain constant.

Mike Cohn said…

Hi John—
You’re absolutely right to bring up the issue of what happens when something goes wrong. When we estimate the effort involved in something, one of the things to consider is the uncertainty inherent in the user story or feature being estimated. Sometimes something complex may not have a lot of uncertainty around it (“I’ve done this brain surgery a million times; it’s a no-brainer,” says the surgeon. “One hour and I’m done.”). More often a complex thing will have more uncertainty. And I’m sure we’d see a correlation: the more complex, the more uncertainty.

If the uncertainty is small relative to the total effort, a common approach is to make the estimate a small amount bigger. (“It almost always takes 60 minutes but occasionally 65 so I’ll say 65 to be safe.”) However, if the uncertainty is greater, then a more common and appropriate approach is to use two estimates. In the Agile Estimating and Planning book I wrote about two-point estimating and suggested estimating a 50/50 number for the story and a number you are 90% confident in. That book also shows how to work with estimated ranges and turn them into plans.
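
As a rough illustration of how paired estimates might be rolled up into a plan, here is a minimal sketch. It assumes a root-sum-of-squares buffer over the gap between each story's 50% and 90% figures, which is one common way to combine ranges; the comment above does not spell out a formula, and the story names and numbers are hypothetical.

```python
import math

# Hypothetical stories: (name, 50% estimate, 90%-confidence estimate), in story points.
stories = [
    ("login", 3, 5),
    ("search", 5, 8),
    ("checkout", 8, 13),
]

def plan_with_buffer(items):
    """Return (sum of 50% estimates, buffer, buffered total)."""
    base = sum(p50 for _, p50, _ in items)
    # Assumption: combine the per-story gaps as a root-sum-of-squares rather than
    # a plain sum, since it is unlikely that every story hits its 90% figure at once.
    buffer = math.sqrt(sum((p90 - p50) ** 2 for _, p50, p90 in items))
    return base, buffer, base + buffer

base, buffer, total = plan_with_buffer(stories)
print(f"50% total: {base}, buffer: {buffer:.1f}, plan for about {total:.1f} points")
```

The buffered total lands between the sum of the 50% figures and the sum of the 90% figures, which is the point of carrying two numbers per story instead of padding each one.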

Mike Cohn said…

Hi Tobias—
Thanks. I’m glad you liked the post. I just love the example of the brain surgery / stamps and it was too good not to use there. I wish I remembered who gave it to me after a class one time.

Thanks also for helping to clarify my points on here. I know you totally get the difference I’m after and appreciate your input.

Mike Cohn said…

Hi David—
Thinking of points as a function of Effort, Complexity and Doubt is fine. In my reply above to John I just combined Complexity and Doubt into one thing: Uncertainty.

Points are a measure of how long it will take (effort). How long it will take can be affected by other things and those can influence our estimate. The key is to remember and understand that it is always about time—no client ever cares how hard we had to think, only how long it took.

Mike Cohn said…

Mark—
Points are just another way of estimating effort. Other things can affect that estimate to the extent they affect effort. But points are about effort. There are many, many advantages to points beyond being a sort of chimera of effort and other factors. See Agile Estimating and Planning for many.

One of the most compelling reasons to prefer story points, even though they are another way of estimating effort, is that they allow individuals of different skills to discuss the relative size of work. My favorite example is of two runners at the start of a trail: one says it will take 5 minutes to run, the other says 10 minutes. Both are right. Those are the amounts of time it will take each to run. There is no time-based argument they can have to settle on the “right number.” However, both can agree that some other trail is twice as long: one runner will be thinking “Yep, 20 minutes” and the other will be thinking “Yep, 10 minutes.” But both can agree “twice as long.”
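
The runners example can be sketched in a few lines. This is only an illustration: the dictionary names and the idea of a per-runner "pace" are assumptions introduced here, with the 5 and 10 minute figures taken from the example.

```python
# Minutes each runner needs for one unit of trail (figures from the example above).
PACE = {"fast_runner": 5, "slow_runner": 10}

# Relative sizes both runners can agree on: the second trail is "twice as long".
TRAIL_POINTS = {"first_trail": 1, "second_trail": 2}

def minutes_for(runner, trail):
    """Translate an agreed relative size into one runner's own time."""
    return PACE[runner] * TRAIL_POINTS[trail]

for runner in PACE:
    times = {trail: minutes_for(runner, trail) for trail in TRAIL_POINTS}
    ratio = times["second_trail"] / times["first_trail"]
    print(runner, times, f"ratio={ratio}")
```

The absolute times differ per runner, but the ratio between the trails is the same for both, which is what lets them agree on a relative size.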

John Sonmez said…

@Mike

Thanks Mike, what you say makes sense.  I need to read your Agile Estimating and Planning book.  I read your Agile User Stories book, and it changed my perspective on user stories.

Mike Cohn said…

Andrej—
What size would there be if not effort / time to implement? Is there some other size that anyone would care about than time?

In practice what you’ll find from looking at a long-time agile team is usually a very moderate increase in velocity—they may know they are 3x as fast as when they started but velocity may be up only 20%. And that is often attributable to increased focus on work.

Mike Cohn said…

Hi John—
I’m glad you enjoyed the user stories book and hope you like the estimating one.

Mark Kilby said…

@David… My apologies for the typo… I meant to say “I don’t think you are in the minority”...but I could be wrong.

@Mike… Yes.. I’m very familiar with your Agile Estimating and Planning book since it emerged in early drafts.  My concern is that the tone of your post will cause those still new to agile to assume points = effort = hours.  Yes, effort is a key component of points, but the “uncertainty” is important as well.  Also, it’s important in describing points to emphasize the relative measure.  In your example of the stamps versus brain surgery, a team assuming points as effort would size those stories the same.  But what if the surgeon is out sick during the sprint and the team “committed” to completing all the stories?  I think if uncertainty was a stronger consideration for the points measurement, that team may not size them the same.  That’s where I think your 2 runners example is a better example.

Maybe we just need to agree to disagree on this?

Amanda Varella said…

Mike, I’ve read Agile Estimating and Planning, and after your post, I think I will read it again! If time must be considered in the points estimates, what’s the purpose of using points? Wouldn’t it be the same as using ideal hours? Ideal days?

Ken Clyne said…

Mike, I’m enjoying your new book.  I can picture you sitting in an open air restaurant in Mexico writing Chapter 1.

Anyway, I agree that sizing is about effort but there are different types of effort.  For example, there is the writing of the code and there is all the thinking (and talking) about the writing of code.  Take the coding of an insertion sort algorithm vs. coding a quicksort algorithm.  Same number of lines of code.  However the quicksort is much more difficult to write, requires more thought and there is a much greater chance that it will need a rewrite.  I find that it helps to talk about complexity and doubt to get people to think of the larger picture and include some of the intangibles in their sizing discussions.

Mike Cohn said…

Hi Amanda—
Let me start with a premise that the goal of estimating is to answer questions such as “When will you be done?” or “How much functionality can we have by the given date?” If that is true, then whatever unit we estimate in and whatever approach we use will need to be about time.  There are many advantages to using an entirely abstract unit such as story points. See, for example, my reply to Mark, on June 21 at 1:07pm above (about the runners).

A good, cheap way to find out more of the advantages is to look at this video on Agile Estimating and Planning that I gave at Google a couple of years ago.

Mike Cohn said…

Hi Mark—
Ah, but points do equal predicted effort (as adjusted by everything we know/don’t know about the item, including its riskiness, uncertainty, etc.). But those are always part of any effort estimate. If you ask me how long it will take me to go to the market and bring home some fruit, my estimate will be, say, 15 minutes. In coming up with that I thought about how long it will take to get there, the likelihood of traffic problems, the likelihood of a slow checker in the market, etc.

If the surgeon is out sick in the example, the team would be unable to do the work. No example should be stretched too far. For example, the biggest problem with this one is that it is of work done by a single person in both cases. A typical product backlog item would be worked on by 3-5 people. So the work here (stamp licking, brain surgery) is more like tasks than user stories (product backlog items). And I don’t advocate using points for tasks on the sprint backlog. So, the example breaks down when pushed too far (as most do).

Mike Cohn said…

Hi Ken—
I’m glad you’re enjoying Succeeding with Agile. I wish I’d written some of it in an open-air restaurant in Mexico!

You are absolutely right that complexity and doubt need to be considered when trying to decide how much effort something will take. But those influence the number of story points assigned only to the extent that they influence the total time the story will take.

Another simple example about why complexity does not matter *except* to the extent it affects the time the story will take: Suppose I give two lists of numbers. One list is a series of additions between two one-digit numbers, like:
4+3=
2+6=
etc
The other is of one-digit numbers to multiply:
1*8=
3*2=
etc
I have no real knowledge of this but I’m gonna swear that more synapses fire in my brain to do multiplication than for addition. Since every schoolkid is taught addition before multiplication it must be easier / less complex. However, I’m pretty sure I could process the two lists in the same amount of time, so they would have the same effort-based estimate whether that is in points, seconds, whatever. They’d have the same estimate even though I would think harder on the multiplication.

Now if the list were of numbers I could get wrong (12*13=) I would very likely increase my effort (point) estimate on the multiplication to account for that. But that is an example of increasing the estimate because the expected time to complete the work went up.
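
One way to picture the two cases is as an expected-time calculation. This is a minimal sketch under made-up assumptions: the function name, the per-problem seconds, the error rate and the redo cost are all hypothetical and only serve to show that the estimate moves when expected time moves, not when the thinking gets harder.

```python
def expected_seconds(problems, seconds_each, error_rate=0.0, redo_seconds=0.0):
    """Expected time for a list: base time plus the expected cost of redoing mistakes."""
    return problems * (seconds_each + error_rate * redo_seconds)

# All figures below are invented for illustration.
additions = expected_seconds(problems=20, seconds_each=2)
easy_products = expected_seconds(problems=20, seconds_each=2)
hard_products = expected_seconds(problems=20, seconds_each=2,
                                 error_rate=0.25, redo_seconds=8)

print(additions, easy_products, hard_products)
```

The one-digit additions and one-digit multiplications come out the same, so they would get the same estimate; only the list with a real chance of mistakes gets a larger one.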

Alida said…

Hi Mike,

I frequently encounter team members who try to estimate a story by rolling up the tasks.  That is a common trap people fall into when they think in terms of effort.

On the runners example, I can see that they would estimate the second trail is “twice as long” in relation to the first one.  But how do they estimate (and agree on) the first trail if they have to run it together?  I am not sure if the analogy breaks down here, but basically how would they estimate a story they have to do together if their abilities are so different?

Mark Kilby said…

@Mike

I agree that points correlate to time… my concern is people will read this post and conclude that a certain point value will map to a certain number of hours and I think that would be a misinterpretation.  So I will agree to disagree that points EQUALS effort, but I will agree that it correlates to effort.

In my experience, I have seen where a certain point value maps to a range of effort hours (for a team) and that range varies from team to team and can change if the team or conditions the team works within changes.

On a different note, I’ve found all 3 of your books most valuable and they are always on my “top recommendations” list.  That’s why I feel so strongly about those new to agile not misinterpreting this article.

Marcello Leonardi said…

Hi to all
I have read both of Mike’s books and have been practicing their estimation suggestions for two years now.
I totally agree with Mike. At the end of the day the stakeholders are interested in time.
If you develop a product for a customer (project based), he is interested in time because he has to align his marketing activities, and in the end he is interested in the cost of the project because he has a fixed budget to spend on building the product.
To get to the cost we need the time that a team will spend on this project until they deliver the product.
To get to this time you can work perfectly well with story points. As described in Mike’s books, we have two possibilities:
1) You have a team that has already worked together and can estimate the complete backlog of a new product in Story Points, and you know what those Story Points mean in terms of time, and in the end cost, because you can extract that from past projects.
2) You have a new team that has not worked together, so you have to find out what one Story Point means in terms of time, and in the end cost.

At the beginning of the year I started a new project (half a year later we are live with release 1.0 - http://www.home.ch/en/ ;-). To get to a time estimate, and in the end a cost estimate, for this project, I gave the team two tasks after we had estimated the backlog using story points:

1) Please break down two Stories (the first Story was 1 Story Point and the second Story was 2 Story Points) into Tasks and estimate the hours you think you will have to spend on each Task.
2) Put together a Sprint 1 Backlog that you think you will fulfill.

For the second part I put together a sheet with all the resources I had for the first sprint, so I knew how much time my team was going to spend on the Sprint Backlog.

With these results and the estimated release 1.0 backlog I could tell the customer that one Story Point will take 2.5 to 3.5 days -> this means xxx to zzz $ cost.
In addition I could tell him that we will take 5 to 6 Sprints to get the backlog done. My customer accepted the range because he understood that we will get much more precise after every Sprint. After every Sprint we then evaluate our estimates against the facts (results).
So, as you can see from my example, I really believe that estimating with Story Points works, and that the translation into time, and in the end cost, works too.

The most challenging part of my initial process was the relative estimating of User Stories.
If you don’t have much detail at the start of the project (in addition to the user stories we had some notes in the user stories, some first wireframes and a draft design idea, which changed completely by the end of the project), it is very difficult to estimate relatively.
This was the most difficult part of the estimation process for us. The mapping from points to time, and in the end to cost, is more or less mechanical work ;-)
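
The mechanical mapping from points to time and cost might look something like the sketch below. Only the 2.5 to 3.5 days per Story Point range comes from the comment above; the backlog size, the cost per day and the function name are hypothetical placeholders, and the real cost figures were not given.

```python
def forecast(backlog_points, days_per_point_low, days_per_point_high, cost_per_day):
    """Turn a points total into a (low, high) range of effort days and cost."""
    low_days = backlog_points * days_per_point_low
    high_days = backlog_points * days_per_point_high
    return {
        "days": (low_days, high_days),
        "cost": (low_days * cost_per_day, high_days * cost_per_day),
    }

# Hypothetical inputs: a 40-point backlog and a cost of 1000 per effort day.
result = forecast(backlog_points=40,
                  days_per_point_low=2.5,
                  days_per_point_high=3.5,
                  cost_per_day=1000)
print(result)
```

Re-running the same calculation after each Sprint with the observed days per point is what narrows the range over time.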

Andrej said…

@Mike
If the surgeon gets some new tooling which allows him to do the operation in half the time he needed before, would you say that the effort is also cut in half, or is the effort still the same, and is the velocity doubled?

Mike Cohn said…

Hi Alida—
You asked how the runners would agree on an estimate if they have to run the trail together. Well, (a) they probably wouldn’t agree because here we are talking more about a task than a user story. This is why I don’t recommend using points for tasks. However, (b) if we think of “run this trail” as more of a user story of perhaps “run three kilometers” we would split that into two tasks: one runner would run one kilometer, the other would run 2 kilometers and they’d finish at the same time in the end.

Mike Cohn said…

Hi Marcello—
Thanks for sharing the story of your successful use of story points.

Abhilash Pandey said…

Hi Mike
There is no doubt that the story points will retroactively map to a certain number of hours but doesn’t directly calling story points a measure of effort completely defeat the purpose of story points, which is relative sizing? I always take story points to be based on what you suggested in Agile Estimating and Planning i.e. How Complex it is or How big it is. I tend to add doubt as third factor there. With this post, I am sure a lot of people will draw inspiration that if story points are a measure of effort then why not use hours, ideal days etc. and not use story points which will bring the teams to a disadvantage of not getting the benefits of relative sizing. I remember another post by you long back on a similar note where you mentioned that there is no one to one relationship between story points and hours (effort) and it is actually a normal distribution. I believe that was a great explanation of relation between story points and effort and I quote that example to every team I mentor on Scrum. Somehow, I find this post conflicting with the earlier post about normal distribution (http://blog.mountaingoatsoftware.com/how-do-story-points-relate-to-hours) and a consensus on story points at several other places.


On a side note, How does the team of a little kid and Brain Surgeon estimate the size of two items viz., lick the stamps and doing the brain surgery using planning poker? For licking the stamps, the kid might think of just licking the stamps and gives an estimate of 1 but the Brain Surgeon will think of the hygiene factor and will find a dependency of creating a stamp damper and estimates it as 5. Then for brain surgery the school kid estimates it at 40 and the Brain Surgeon estimates it to be a 3. The surgery can never be a 3, 5, 8….for the school kid, so how do they converge?

Mike Cohn said…

Hi Andrej—
If the surgeon gets a new tool that cuts (pun!) the time in half, then the # of points would be halved as well.

Mike Cohn said…

Hi Abhilash—
Yes, in Agile Estimating and Planning I wrote about the points being about how “big” the story is. I wasn’t as clear in there as I have been in this post. It is all about the amount of time something will take. No client/customer/stakeholder cares about complexity *except* to the extent that complexity affects my estimate of effort.

I’ve already addressed in other responses here that there are many, many reasons why story points as a measure of effort still have tremendous advantages over other things like ideal time.

I don’t see a conflict at all between this post and the one saying that points map to a range of hours rather than an exact value.

In estimating if the team is just the kid and the surgeon, the kid would defer to the surgeon’s expertise on the surgery and perhaps both would collaboratively estimate the stamp licking since both could relate to it.

Prashant Pathak said…

Mike,

Thank you for writing about this. With all due respect, I would differ with you !

I think Story Points make sense when we talk about features and complexities and are trying to gauge what the team thinks of that “Story”. Tasks on the other hand can be estimated in number of hours.

I understand when you say that complexity is not what the clients want and I will have to get the estimated hours from the past to give a good estimate.

In practice, I use 3 measures
1. Story Points—measure of complexity (not in hrs.) useful for the team to discuss the complexity of the issue
2. Estimate in hrs.—could be a total of all the task estimates—good for estimating the time required for future tasks to the client
3. Cumulative Business Value—This is probably the most interesting number we can get to the client

Your thoughts are welcome.

Michael said…

Mike,

Very interesting post (and discussion). I have found your books to be some of the best resources on agile. But this post feels really misleading.

You say that Story Points should not be re-branded as Complexity Points, and used primarily as a measure of complexity. THAT I agree with.

But, the stamps/surgery analogy and some of your statements effectively say that “Story Points = Effort = Time”. That equation assigns Story Points a level of precision that they simply don’t have. It also assigns story points an absolute rather than relative value.  Abhilash and Mark correctly make this point. Story points may be CORRELATED, but are not directly equal.

The simple fact is that when estimating story points, the team (even a seasoned team) will not accurately know the level of effort for a story. What the team will be able to do accurately is rate a story’s effort, complexity and risk (or other factors that are helpful to the team—we use “risk”) relative to other stories they have completed or planned. Effort, complexity and risk are tools/criteria used to relatively evaluate stories.

YES, ultimately it is all about time/effort. But people are terrible at estimating time/effort. SO, we use relative story points that look at relative complexity, effort, risk, doubt, etc among stories. OVER TIME, those will prove out and be useful for planning purposes (and thus to the client).

But the bottom line is that I think it is absolutely useful to think and estimate in terms of complexity, risk or any other criteria which help you estimate relative story points. Consistency is more important than precision.

And in the end, consistency, and velocity (over time) are what will deliver the value clients want to know (when will X be done).

Sure, you are right in the end. But, I find this post misleading from a practical point of view, and likely to lead agile teams astray.

Mike Cohn said…

Hi Prashant—
It’s perfectly fine to disagree with me. However, from your comment I don’t see anything relevant I disagree with. I, too, estimate tasks in hours directly. I do that because it’s often fairly clear which team member will do a task so we don’t need to argue a lot about “it’s 4 hours,” and “no, it’s 8.” For story points, we use an abstract unit and estimate relatively. We can both say “Story A will take twice as long as Story B” meanwhile you are thinking a completely different amount of effort is involved than I am.

Tobias Mayer said…

@Prashant and others who want story points to be a measure of complexity.  My question would be: then what?  So they measure complexity. How is that useful to the business?  How does it help make predictions or do release planning?  The beauty of relative effort estimates is they eventually map directly to time, and to cost.  We can very soon make realistic date and cost estimates based on empirical data. I can’t see how you can do that if you are estimating complexity.

Of course, the ultimate goal is to have all stories approximately the same (small) size, i.e. take approximately the same time to develop, so we don’t need to estimate them. Velocity then simply becomes a count of stories completed per sprint.

Pretty much all software development is complex, singling out certain stories as “more complex” may trigger some good dialog, but is unlikely to be helpful to the business.
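
A release prediction from empirical velocity can be sketched as follows. This is an illustrative assumption rather than a method stated in the thread: it takes the slowest and fastest recent sprints as the bounds, and all numbers are hypothetical. The same function works whether the unit is story points or, for same-sized stories, a plain count of stories.

```python
import math

def sprints_remaining(remaining, recent_velocities):
    """Range of sprints left, using the fastest and slowest recent sprints as bounds."""
    fewest = math.ceil(remaining / max(recent_velocities))
    most = math.ceil(remaining / min(recent_velocities))
    return fewest, most

# Hypothetical data: 100 points left, last three sprints completed 18, 22 and 20 points.
print(sprints_remaining(100, [18, 22, 20]))

# Hypothetical data: 30 same-sized stories left, last three sprints completed 6, 7 and 5 stories.
print(sprints_remaining(30, [6, 7, 5]))
```

Multiplying the resulting range by the sprint length and the team's cost per sprint gives the date and cost ranges the business asks about.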

Mike Cohn said…

Hi Michael—
I’m glad you liked my books and am sorry you find this post misleading. I actually think the post is turning out far more useful than I’d thought initially in that it is uncovering some strongly held but incorrect understandings of what a story point should measure.

It was actually Mark above who wrote “points = effort = hours”. You are taking me to task for not saying points are correlated with hours. However, I agree with that. Points do not *equal* hours because points are an estimate. That is why I previously blogged that there is no equivalence relationship between points and hours. See http://blog.mountaingoatsoftware.com/how-do-story-points-relate-to-hours 

What I’m saying in this post is that points are an estimate of effort (“how much time will this story take?”). Perhaps the way to say that is that points are a function of effort, risk and uncertainty, or SP = f(E, R, U). (Call one of those complexity if you want; it’s not important.) The idea is that points are an estimate of the effort involved. Risk, uncertainty, complexity, doubt and other things people have mentioned here can be incorporated BUT only to the extent they affect the expected effort. If something is complex but that complexity will not affect the time to implement the feature, that complexity should not affect the estimate; that was my point with the lists of numbers to be multiplied or added.

You wrote that “it is absolutely useful to think and estimate in terms of complexity, risk or any other criteria which help you estimate relative story points.” I agree but, as in the prior paragraph, only to the extent those things affect the time the story will take. Complexity that doesn’t affect effort is not relevant.

I have no idea why this post will lead agile teams astray. I have been using this example in classes and consulting now for I believe 3 years. It always serves as a point of tremendous clarification. Far more likely to lead teams astray is when someone re-labels story points as “complexity points.” That pushes a team into thinking that complexity is the sole factor in sizing the story.

I witness this continually in my Agile Estimating and Planning classes. In one exercise I ask teams to estimate a list of things that includes, “Wash your boss’s Porsche” and “Read a 10-page academic article about Scrum.” Those are probably somewhat similar in time; maybe one is 2x. When teams in my class start asking “how complex is this?” they will end up saying that washing the car is mindless work and it then gets a much lower number than reading the academic article. Say 1 and 5. Yet both take the same amount of time. (If you disagree alter the size of the paper or car; it won’t matter.) When the team runs a sprint of doing those activities they will find that they take the same amount of time and therefore should have been given the same estimate.

Jean Tabaka said…

Hey all,

Is anyone else reading these comments thinking about systems? Not to pile on, but, to pile on :-) a complicated system is different than a complex system. So complexity counts for something at this very fundamental level. To round out this pile, there are also simple systems and on the other end of the spectrum chaotic systems. How are you going to declare points? Well, in a simple system, it is pretty darn easy. A straightforward solution awaits you, easy to articulate, easy to deliver. In a complicated system, we’ve got a solution; it’s just mighty complicated. Still, we know the solution to a problem. In a complex system, the solution set just isn’t that easy to declare; it is beyond complicated. It has difficulties that bring ambiguity. Is complexity then the same thing as doubt? I don’t care. I just know that my problem set is beyond complicated, so I have a lot more to think about before I respond to how I can deliver a solution. In chaos, nothing on this earth will help you provide point guidance IMHO. Complexity represents that differentiator between what is merely complicated and what has the ambiguous difficulties inherent in complexity.

I learned something of this from a presentation given by Jim Sutton at the Lean Systems and Software Conference and the work done by David Snowden on Cynefin. I may have it wrong. Like points, I probably do have it wrong.

Michael said…

Mike,

Thanks for the great response, and you’ve convinced me we actually do agree.

On SP = effort = time, I actually understood that directly from the post where you say: “the two items should be given the same number of story points–each will take the same amount of time.”  Although your clarifications now make it clear, that line seems like it is saying that SP = time directly.

The thing that is confusing here is that TIME is only known in retrospect. So, I find your responses much clearer than the original post.

The other thing I find confusing in both the original analogy and your car wash / reading example is this: I think they are confusing complexity and difficulty.

I (and my agile team) generally understand complexity to mean “number of moving parts”. In the car wash / reading example, both are of equal complexity. One may be more mentally challenging (difficult), but both are straightforward, can be done by one person, etc.

The stamps / surgery analogy feels disingenuous. The brain surgery is inherently more complex (if we are being realistic). It requires a complex lab setup, nurses, anesthesiologist, and maybe a consulting physician by video conference. So my point is this: If we look at this and say “the stamps will take about 2 hours and the surgery will take about 2 hours”, and give both the same story points (i.e., 5), in the long run we’ll get burned. The greater complexity of the surgery will tend to mean over time, this type of operation will average maybe 4 hours (one time the anesthesiologist shows up late, the next time the video conference doesn’t work right and so on.)

I think you nailed it for me when you say, “I agree but, as in the prior paragraph, only to the extent those things affect the time the story will take.”  My whole point is that people are bad at estimating time directly. So, we’ve found it most useful to compare the relative effort, complexity, risk, etc. of stories. In retrospect, that method helps us estimate pretty accurately so that indeed we are accurate about how much time it takes.


Tobias: I am definitely not in the camp of saying we should “estimate complexity”. My only point is that the post feels like it is saying “let’s not care about complexity, let’s just estimate effort/hours”. I totally agree with Mike and you that the only point of the whole exercise is to figure out how long something will take. I am just saying we find the best way to accurately estimate effort is to compare complexity, risk, doubt, etc. They are just tools toward the same goal.

In any case: Mike, thanks so much for your time and this great discussion. Definitely provoked some deep thinking.  Hope to run into you around Boulder some time!

John Mc said…

Mike,

I’m Scrum mastering a couple of teams in a shop that’s newish to agile estimating and planning. These types of questions are haunting us. Let me set up a scenario and see what your thoughts are.

Let’s say that we have a Scrum team whose product owner wants them to wash 7000 Porsches.  That’s too many Porsches to wash inside of one one-week sprint, so they work with the product owner to split the story into 700 stories with 10 Porsches each.  The team has never washed any cars before, but they understand the process and are confident that they can knock out 10 stories (100 Porsches) per sprint, so they assign 5 story points to each story and accept the top 10 most valuable stories into their first sprint, for a planned velocity of 50 story points.

Now, as the team goes through their sprints, they are going to get better at washing Porsches.  What I understand you to be saying is that they should re-estimate their user stories as they get better at washing cars, but their velocity in story points should remain constant at 50 (or very close to it).  What makes more sense to me is that you let the estimated stories stand but let the velocity rise as the team gets better at washing cars.  Likewise, if the water company cuts water pressure to the car wash site, it will take them longer to fill their buckets and thus slow them down.  Do they need to re-do their estimates, or just let it wash out in their velocity?

This is obviously an over-simplified example, but in cases where software teams are doing relatively repetitive tasks (like, say, converting a web app’s UI from technology A to technology B, working page by page), the first one they do will take more time than the second, if the second and the third have a similar complexity (oops, sorry, couldn’t avoid that word there). This is similarly complicated by the fact that staff turns over, thus “re-setting” the learning curve. Would that necessitate a full re-estimation of the backlog?

We are really wrestling with this stuff when we do our sprint planning, so I appreciate the blog post and everyone’s great comments.

Thanks,
John

Mike Cohn said…

Hi Michael—
I’m glad to read we agree after all. I have edited the original post (something I usually avoid) to read that “each is expected to take the same amount of time” (about the surgery & stamps) rather than each “will.” You are right of course that the actual time is known only in hindsight.

As for the lab set up, etc for the surgery. Sure, there are many others involved but consider this example: Suppose a development team decides to put a number of points on a story of “hire a new programmer.” In considering that, they should only include the effort that will be expended by team members; they do not consider the effort of the non-team-member Human Resources staff. If we included HR effort in the estimate, we’d put too many points on it, overstate velocity and lead to later problems when the team’s velocity wasn’t benefited by the effort of HR members. So, in the surgery example, the surgeon was the only team member involved in that work.

I’m glad you’ve enjoyed the discussion on this one. I had no idea it would invoke such differing and strongly held opinions.

Mike Cohn said…

Hi John Mc—
Good question. In the case you described I would not re-estimate. The issue isn’t “Hmm, we’ve gotten better, let’s re-estimate. That story used to be a 5, now I want to call it a 4.9.” (Obviously an exaggeration.) In the case you describe the velocity would increase a bit as the team got better.
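To make that concrete, here is a rough sketch using the numbers from John's Porsche scenario (700 stories at 5 points each, an initial velocity of 50). The later velocity of 60 and the point at which the team speeds up are purely illustrative assumptions, and `sprints_remaining` is a hypothetical helper, not anything from a real tool. The estimates are left alone; only the observed velocity changes, and the forecast follows from it.

```python
import math


def sprints_remaining(points_left: int, velocity: float) -> int:
    """Forecast the sprints needed to finish the remaining points at a given velocity."""
    return math.ceil(points_left / velocity)


# Numbers from John's scenario: 700 stories, each estimated at 5 points.
STORIES = 700
POINTS_PER_STORY = 5
total_points = STORIES * POINTS_PER_STORY  # 3500 points in the backlog

# Initial plan: 10 stories per sprint, so a velocity of 50.
initial_forecast = sprints_remaining(total_points, 50)

# Assumption for illustration: after 10 sprints at velocity 50 the team
# gets better and finishes 12 stories per sprint (velocity 60).
sprints_done = 10
points_done = sprints_done * 50
improved_forecast = sprints_done + sprints_remaining(total_points - points_done, 60)

print(initial_forecast)   # 70
print(improved_forecast)  # 60
```

Nothing in the backlog was re-estimated here. The improvement shows up as a higher velocity and a shorter forecast.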

Now, if you have 10-year-old backlog items you may want to re-estimate those to reflect a team’s improved productivity, but my first recommendation there would be to not have a backlog anywhere near that long. I typically recommend a backlog of no more than 100-150 items which would likely be 10-20 sprints, and then perhaps no more than 40 weeks. And that’s a maximum. I’d like the product backlog to reflect no more than about 3 months plus some big huge epics giving a wild view of what’s after that.

To your other question, I can’t recall a time where we’ve re-estimated the entire product backlog. (Except perhaps being 10 items into the initial estimates and deciding to re-baseline because perhaps our numbers were too tightly grouped.) But, for more on re-estimating including when I do recommend it, see http://blog.mountaingoatsoftware.com/to-re-estimate-or-not-that-is-the-question

Michael said…

Mike,

Wow, I’m honored! I think that change is helpful.

Your last answer (about the surgery support staff) made me think of a related question: Do you consider external forces in terms of how they affect effort? 

For example: suppose you are working with an external client/PO. You have a new feature that you know has to get input from some department, AND you are going to need some data sets from the sales group. You know from previous experience that this kind of story requires extra effort—communicating, verifying and processing all these externals.

Ideally, you have no dependencies, and to a large extent your ScrumMaster should be removing these impediments before they affect workflow. But still, you need to estimate weeks in advance.

Do you increase your estimate in this case? Tell the client the story isn’t ready for estimation until you have all the data and input you need?

Thanks

Mike DePaoli said…

Great Post Mike. It certainly got some synapses firing :-)

I completely agree that at the end of the day all the customers / stakeholders care about is how long it is going to take, because this largely determines cost and also dependencies for the broader organization.

There is one broader point about relative point estimation that I don’t think we should lose sight of. Regardless of what unit one ultimately wants to arrive at, it is valuable because of how the human intuitive algorithm works. We are much more capable of processing patterns of information and bring to bear more information when doing exercises like relative point estimation.

In software development it provides an excellent way for the work experience patterns of a cross-functional team to be integrated for the purposes of estimation.

So, bottom line, relative point estimation is useful to us as humans just because of the way our brains work. You might as well put the full power of your brain to work :-)

I only mention this because some posts on this thread smacked of the thinking ‘so what, story point estimation isn’t useful because it’s the same thing as estimating in hours?’.

Mike Cohn said…

Hi Michael—
I’m glad that change helps us agree on this.

As for the situation where part of doing a product backlog item requires the team to do something like nag some other team (“remember, we need such-and-such by Monday”) or put in other effort (e.g., going to meetings to persuade them to do it at all), I would include that effort in the estimate of the backlog item. You didn’t ask, but when sprint planning comes along we’ll also create tasks like “nag IT group to get new server configured, 2 hours.”

But, in cases when you don’t need to start the story yet and it can be safely deferred another sprint, I might very well tell the product owner that we can’t estimate the work until some other group finishes their work or believably commits a date to us. I also tend to take that approach when the other group has a poor record of meeting their own commitments to us.

Mike Cohn said…

Hi Mike—
Yes, you are absolutely right that there are some tremendous advantages to story points because of how it forces us to think about the estimates. Perhaps I need to blog on that pretty soon, too. Thanks.

Russell said…

As Mike describes in his book “a fundamental and common challenge in many organizations adopting and adapting agile and lean product development is estimates and commitments are considered equivalent; in other words, one assumes that the commitment to cost, schedule, scope and quality has to be the same as the estimate; it doesn’t.”

What should happen is we derive an estimate and, on the basis of that estimate, make a commitment to develop and deliver a story or defined functionality at a specific level of quality, by a certain point-in-time and at a specific cost.

Estimates using a relative unit-of-measure, for cost, schedule, scope and quality are derived from a prioritized and sized Product Backlog during Release Planning.

When you are estimating story size, at the Product Backlog level, a story should contain just enough detail for the team to be able to estimate its relative size to other stories based on certain criteria such as presented in this blog.

My initial set of criteria starts with the following but is adjusted based on the reality of the situation:
1. Complexity
2. Uncertainty
3. Knowledge and experience with domain
4. Knowledge and experience with technology

Commitments per story using a unit-of-measure in hours for what the solution involves or what it will take to deliver the story are derived during each Iteration/Sprint Planning session.

When committing to getting stories done during Sprint Planning, a story should contain enough detail for the team to be able to determine what the solution involves and extrapolate what it will take them to deliver the story in hours based on having tasked out the level-of-effort required to get the stories done.

Complexity points better off said…

This philosophy works within a project; however, if you have a group of 10 projects working on the same product, you need to have a common unit, and here using story points/hours as a common unit has many shortcomings.

E.g. if you are trying to buy a t-shirt, will you only look at the price of the t-shirt? I think the first thing that we need to look at is the size of the t-shirt, whether it fits or not.

I would like to give an example of the same project using story point estimation and using complexity point estimation.

A project has a backlog consisting of 10 stories to be taken in a sprint. Supposing that the team has committed to all the 10 stories for a sprint. Let’s look at the workflow using both the scenarios

Project using story point(hours) estimation
• The team is sitting for look-ahead and getting understanding from the product owner on the stories in the backlog
• The team is using the time after the sprint planning and understanding the story better
• The team is sitting for sprint planning meeting and estimating hours that the story will take based on their understanding
• The team discusses the discrepancies in the estimations to come to a consensus. Here there are no in-depth discussions on implementation, and the team may uncover some implementation issues during sprint execution
• Requirement analysis is not done, requirements are only understood at a high level
• Whether the look-aheads were effective or not cannot be judged
• The variance range for story point estimation has been huge (beyond +/-100); this has been analyzed based on the data for around 6-7 sprints from all 100 projects, which means that the next time we estimate, the likely variance falls within that huge range.
• The dashboards representing velocity for all projects are not an apple-to-apple comparison. Hence the projects with the same story point velocity have different quantum of work being done.
• If the story points are efforts, it does not help business in predicting how much we can complete for the next release

Project using complexity point estimation
• The team is sitting for look-ahead and getting understanding from the product owner on the stories in the backlog
• The team is using the time after the sprint planning and understanding the story better
• The team is sitting for sprint planning meeting and trying to assign the values to complexity factors
• The team discusses each requirement technically in depth while assigning the complexity points so that the implementation challenges and clarifications required can be established
• Requirements analysis is incorporated into the discussions during sprint planning
• If in the sprint planning, we are unable to assign the complexity points, that means that requirements are still not clear and technical people who are going to work on the story still lack some understanding
• The complexity point equation statistically can give around 80 % of confidence that the estimates will be met, which gives us more clarity and predictability
• The complexity points per project is an apple-to-apple comparison
• Complexity points start building its history in projects, and business can get better predictability from the past sprint data

All that we are doing is finding the answer to the business question “when can we deliver this?”; however, complexity points can answer that in a better way in the long run.

Complexity points in turn relate to efforts at the end to answer that.

So if we are building a house of a fixed size, a builder who has a better idea of how long it takes per unit of size for the house is able to predict better than the builder who has just built the house without calculating how long it takes per unit of size.

Mike Cohn said…

Dear “Complexity Points Better Off”—
For information on how to create a common baseline for multiple teams, see http://blog.mountaingoatsoftware.com/establishing-a-common-baseline-for-story-points

I’m not totally convinced, though, that you are estimating complexity. Your last paragraph fits exactly with what I’ve been trying to say throughout this post. Story points are an estimate of how long something will take. Complexity factors into that but is not the SOLE determinant of the number of story points. When someone re-labels “story points” to “complexity points” they turn the measure into a measure of one thing—complexity. That’s what’s wrong. It is more than complexity.

Complexity points better off said…

Mike, Thanks for your reply.

But we are also considering the environmental factors after we arrive at complexity points. So one complexity point can have different efforts when it comes to team composition, domain knowledge, technical know-how, etc., which are typically environmental factors. We are statistically evaluating an equation which gives the correlation between complexity and efforts.

I have gone through your blog as well: 44 people are estimating 12 stories in a meeting. You can call a meeting with the same number of people after one or two months, and the estimates would differ. But if you do the same exercise with the same set of people using complexity points, you get the same answer even after a year.

So size of the building remains same even after few years but efforts to build the same building changes after few years/months/days ....

Vivek said…

Interesting!!!

I have gone through the valuable inputs on this blog. Very interesting.

My points:-

1) If we are using effort to enable planning & tracking, what do we do with Story Points? Why are story points required?
2) If complexity is helping us arrive at a better effort estimate, why do we resist using Complexity Points as one of the size measures?
3) If the team skills are changing & team members are changing, and sprint capacity (man-hours) is changing sprint by sprint (which I believe is a realistic scenario for countries managing outsourcing, at around a 20% attrition rate), how do I know the team/project performance?

Russell said…

I echo Mike’s point that “complexity” should be one factor of the relative “unit-of-measure” one uses when estimating; keeping in mind the distinction between estimating and committing.

For example the following factors:
1. Complexity
2. Uncertainty
3. Lack of knowledge and experience with the domain
4. Lack of knowledge and experience with the technology

It is in the purview of you, your team and your organization to decide what factors you are going to use to derive your points per story using a relative unit-of-measure; keeping in mind complexity is just one factor.

Mike Cohn said…

Hi “Complexity Points Better Off”—
I don’t disagree at all that there is a statistical correlation between complexity and effort. However complexity is not the only factor determining effort.

Mike Cohn said…

Thanks, Russell.

Mike Cohn said…

Hi Vivek—
Responding to your 3 points above:
1) I will need to write another blog post sometime about why I think story points are far better than estimating directly in ideal time or any time-based estimate. As noted in some of my replies above, some reasons are given in the Agile Estimating and Planning book but I’ve come across even more reasons to prefer story points.
2) Complexity is an element in determining total effort. I don’t know why, though, you’d want to separate out a different unit called “Complexity Points.” Stick with one unit called Story Points (or something similarly innocuous) and use it as effort as adjusted by things like complexity, uncertainty, risk, etc.
3) See my other blog post on what to do when team size changes. You can track that and make predictions if you track the historical data on relative changes in velocity when teams change size.

Russell said…

There are two primary reasons we estimate story size using a relative unit-of-measure at the “Product Backlog” level:
* the team’s understanding of each story

* a high-level cost, schedule and common understanding of scope

As for playing planning poker, practical experience tells me we do not do this for the entire Product Backlog for a myriad of pragmatic reasons.

What works for me is to have the team size each story as XXL, XL, L, M and S. Then, for the stories that are medium, we play planning poker. I assign XXL=55 points, XL=34 points, L=21 points, M=5, 8, or 13 (we play planning poker to determine if the story is a 5, 8, or 13) and S=2.
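As a small sketch of how that sizing scheme could be written down, the snippet below maps T-shirt sizes to the point values listed above and only asks for a planning poker result when a story is sized M. The function name, the dictionary, and the sample backlog are assumptions made up for illustration; only the size-to-points values come from the comment.

```python
# Fixed point values for every size except M, taken from the comment above.
TSHIRT_POINTS = {"XXL": 55, "XL": 34, "L": 21, "S": 2}

# Medium stories get one of these values, decided by planning poker.
MEDIUM_CHOICES = (5, 8, 13)


def story_points(size, poker_result=None):
    """Return the points for a story given its T-shirt size.

    Only medium ("M") stories need a planning poker result.
    """
    if size == "M":
        if poker_result not in MEDIUM_CHOICES:
            raise ValueError("medium stories need a poker result of 5, 8, or 13")
        return poker_result
    return TSHIRT_POINTS[size]


# Hypothetical backlog: (size, poker result if the story is medium).
backlog = [("XL", None), ("M", 8), ("S", None), ("M", 13)]
total = sum(story_points(size, poker) for size, poker in backlog)

print(total)  # 57
```

Only two of the four sample stories needed a round of planning poker, which is the time saving the approach is after.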

When it is time to plan our iteration this is when we use a more precise unit-of-measure as the team commits to what it will take to get each story done.

Rahil Patel said…

Like many folks who have responded, I was a bit baffled at first after reading your post. 
In one of your responses, I’m glad that you wrote down SP = f(E, R, U) - it does bring things a lot closer to what I understand points to be.
You also mention in your post right above that complexity is part of effort; this also makes things a whole lot clearer.

So then would effort be a function of time and complexity rather than just one?
Also, how does one deal with temporary learning curves that go away after a while from a points standpoint?

One more question - do you think of Story points as a measure of “Doneness”, rather than just a measure of effort?  In other words, SP is a results oriented metric, because you don’t earn SPs till a story closes, ideally delivering business value with the closure.  A team could spend significant effort toiling away at a story, but points are not earned till all the acceptance criteria are met.  Thoughts?

Daniel K said…

Hi Mike

“One of the most compelling reasons to prefer story points even though it is another way of estimating effort is because it allows individuals of different skills to discuss the relative size of work. My favorite example is of two runners at the start of a trail–one says it will take 5 minutes to run, the other says 10 minutes. Both are right. Those are the amounts of time it will take each to run. There is no time-based argument they can have to settle on the “right number.” However, both can agree that some other trail is twice as long—one runner will be thinking “Yep, 20 minutes” and the other will be thinking “Yep, 10 minutes.” But both can agree “twice as long.””

I think this reasoning breaks badly if applied to your brain surgery/stamp licking example. The kid can presumably lick 1000 stamps in about an hour, while it will take him ~10 years to learn brain surgery. So for him the latter task takes ~10^5 times longer to complete (10 years is roughly 88,000 hours). For the brain surgeon the tasks take approximately the same time to complete. Clearly these tasks are not comparable in the same way the running tasks are.

The “comparability” of software development tasks also varies (although seldom past any of these two extremes). Comparing the size of tasks involves assumptions, for example about who will do it. Often the most direct route to compare the size of two tasks is to make the necessary assumptions and reduce them both to expected time ± expected deviation. But in that case we have to do two absolute estimates to arrive at the relative one, so what does relative estimation buy us?

Mike Cohn said…

Daniel—
One of the points I made in a comment here somewhere was that the example breaks down here because it is of tasks rather than of something closer to a user story. A typical user story or product backlog item will involve work from 3 or more team members. The kid/surgeon example here is simplified to be of tasks. For a user story the impact you describe is mitigated by having 3+ people involved in the work. (That is, it’s unlikely that the 3 working on one story are all 10x better than the 3 working on a different one.) So individual performance differences wash out (some, not entirely).

Additionally, I’ve made the point above that we should assume that the right person will do the work. That doesn’t mean that the programmer who is 5% better than the other programmers needs to do all coding, but in the kid/surgeon case, I suspect that sometime during the kid’s 10 years of med school, the surgeon will find the necessary half hour to do the surgery, making the kid’s education all for naught ;)

Mike Cohn said…

Hi Rahil—
Yes, story points are an estimate of how long we think something will take. How long we think it will take is a function of the effort involved, how uncertain we are about our estimate of that effort, how much risk is there, etc. Some of those are affected by complexity. A highly complex task is likely to have more risk and likely to have a higher number of story points. However, the complexity itself does not add to the number of story points. It should only do so if that complexity affects or might affect the amount of time the work will take.

As for learning curves, the team should always estimate given their current state of knowledge, tools, etc.

And yes, points are earned only when a story is done. No partial credit is given.

Baartz said…

Nice article. Any thoughts on the relationship between backlog order and effort? It appears to be a recurring theme that the completion of backlog item A reduces the presumed effort in backlog item B. Since effort estimates are captured early, and prioritization can occur late, how does one account for this?

Mike Cohn said…

Hi Robz—
Nice summary of our discussion.

Mike Cohn said…

Hi Baartz—
I’ll have to create a separate blog posting on that. But it’s a good idea so I’ll see how soon I can get to it. I’m coming off a two-week holiday (en route home right now) so I’m quite backed up on things currently. I’ve added this topic to the list though.

Marek Blotny said…

Hi Mike, the concept of story points was always slightly confusing for me, but as I can see, not only for me! Inspired by your post, I have published a post on my blog about story points and what they are for me.

I have proposed the following mapping between SP and time:

1 SP = less than 4 hours
2 SP = 4 hours to 1 day
3 SP = 1 to 2 days
5 SP = 2 to 4 days
8 SP = 4 to 8 days

Of course each team can re-define this mapping for their own purpose but do you think that it can be used as a starting point?

Mike Cohn said…

Hi Marek—
If you do a mapping like this you completely defeat the purpose of story points, which is to allow individuals of different skills to talk about the work. If you are twice as good a programmer as I am, you and I will never come to an agreement using your scheme. You could do some piece of work in 4 hours but it would take me 8. You and I could agree, though, that this piece of work is half the size of some other piece of work. For more on mapping points to hours see my earlier blog posting at http://blog.mountaingoatsoftware.com/how-do-story-points-relate-to-hours
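Here is a tiny sketch of that argument, using the 4-hour and 8-hour figures from the reply above. The item names and the helper functions are hypothetical, chosen only to show that two people who disagree on hours can still agree on relative size.

```python
# Hypothetical sizes relative to a reference piece of work (size 1).
relative_size = {"small piece": 1, "larger piece": 2}

# Hours each person needs per unit of size, from the example above:
# the stronger programmer needs 4 hours for the small piece, the other needs 8.
hours_per_unit = {"you": 4, "me": 8}


def hours(person, item):
    """Hours this person would expect to spend on the item."""
    return hours_per_unit[person] * relative_size[item]


def ratio(person):
    """How many times longer the larger piece takes this person."""
    return hours(person, "larger piece") / hours(person, "small piece")


for person in hours_per_unit:
    # The hours differ from person to person, but the ratio does not.
    print(person, hours(person, "small piece"), hours(person, "larger piece"), ratio(person))
```

The two people would never settle on a single number of hours for either piece, yet both arrive at the same ratio of 2, which is all a relative estimate needs.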

Marek Blotny said…

Hi Mike, thanks for your reply. I see your point but I still have my doubts:

1) Everything depends somewhat on the definition of ‘done’ for user stories. Usually coding is just a part of all the activities required to finish a story. Therefore I don’t think such big differences between individuals are very likely.

2) My idea is to find a reference story and agree within the team that the given story is worth, let’s say, 2 points. And from that point compare (in terms of size) other stories to the reference one. Of course one may say that “individuals of different skills” are unable to agree in terms of time. It is important to understand that we should talk about the team rather than individuals. It’s hard to assume that story X will be implemented by individual Y. Estimates should take into account the average skill within the team. Therefore I think that there are 2 options to agree while estimating: simply take the highest estimate or ask these individuals to estimate how much time they need to finish the given story working *together* on it.

Mike Cohn said…

Hi Marek—
You’ll see in the dozens of comments from me on this post that I agree with your two points: the things we estimate in story points are product backlog items and these are typically done by multiple people so that individual performance differences (which are often much larger than 2-1) are reduced somewhat.  Second, I agree with finding a “reference user story” and calling it a 2 and then estimating relatively from there. That can’t happen though if individuals are each mapping points to hours such as 1 point = 4 hours. Instead two points is whatever that one story is and then 4 points is something we each think will take twice as long.

Dan Bergh Johnsson said…

Hi Mike

I like the example of “licking thousand stamps” as it really clarifies the distinction between complexity and effort - and it is no doubt the latter that is useful to estimate and track, as it enables prediction; the former is not useful in the same way.

However, the “licking thousand stamps” is also a good example to show how complexity and effort are related. And how, in software development, the former drives the latter.

When licking ten, hundred or thousand stamps, it is the bulk of the work that drives the effort. But if you are to lick a million stamps, then you will probably build a stamp-licking machine. And then it is the complexity that is the major driver of effort.

In my experience, software development efforts are very often about handling complexity - essential or accidental. Thus it is more akin to the machine-building than to the manual stamp-licking. Of course, complexity that does not add to our effort is uninteresting (as you have pointed out).

So, IMHE we want to estimate effort, but complexity is the major driver behind that effort.

I elaborated this a little at http://bit.ly/cCQLzr.

How does this compare to your experience? What other drivers of effort have you seen in the wild?

Mike Cohn said…

Hi Dan—
Good points. And yes, I completely agree that complexity is a major driver of effort in our industry. In one of the many comments here I listed others such as uncertainty, risk, and sheer volume of the work (1000 vs 10 stamps).

JOOLz said…

Hi Mike

I think there is a common problem here with story points which is that in my experience there are always some who want to define exactly what a story point actually is. In doing so they try to equate it to some other measurement that they are familiar with such as time.  This is understandable.

But we mustn’t forget how story points assist the team by remaining an abstract measurement of relative size.  It alleviates the necessity for long discussions about how the story is to be delivered, how complex it is and of how much time it will take to implement i.e. too much detailed discussion.  Discussions to this level of detail generally result in differing views within the team about the amount of effort required and therefore a protracted, labour intensive and perhaps tedious planning meeting.

Keeping story points as an abstract but relative measure of effort means that teams get to estimate a product’s backlog more quickly.  As soon as you have your first story estimated it becomes easier for the team to estimate the subsequent stories.  In my experience, even if the estimate for a story differs wildly between a tester and a developer, following a brief discussion as to why they differ so greatly, the team will eventually come to a consensus of how much effort is required.  This resulting from a discussion where the amount of time it will take is never mentioned.

Also, after a number of sprints it’s my experience that the team’s ability to estimate the size of stories settles down i.e. the individual estimates from each team member start hitting the same values from the first reveal of the planning poker cards for a given story.

I guess what I’m getting at is that experience speaks volumes.  And experience shows that story points do work even though it might appear they can’t or shouldn’t.  I would encourage anyone sceptical about using story points to try it first and give it a fair go too.  Tried over 4 or 5 sprints, I’m sure they will begin to see how the team will come to value it, seeing how it has improved their ability to estimate and plan a product delivery more quickly and effectively and with more accuracy.

Mike Cohn said…

Thanks for your comments, JOOLz. My experience matches yours. I openly acknowledge how uncomfortable story points can seem before trying them. Once used to the idea, they work well for most teams, though.

John Esser said…

Story points, long understood in the Agile community, are a measure of SIZE, not effort, nor complexity, nor duration. More specifically, story points are an expression of relative size. One major problem is that effort is generally thought of in terms of time units (e.g. man-days), so this causes a lot of confusion. Teams need to be broken out of this way of thinking to use story points effectively. I use the analogy of a house: a 1000 sq. ft. home is just that. How long does it take to build it? It depends on who is doing the work, their experience level, and other factors. In software we look at things like how much code we’ll have to write, how many pages we’ll have to change, how many tests we need to create, and how much code we’ll need to refactor. The key is having stable reference stories from which to estimate.

In the house example, if I have junior carpenters the elapsed time may be 6 months, but if I have master framers maybe it takes 4. Because I have used size (rather than effort or complexity) I can see the effect the master workers had on the elapsed time.

The advice of Jeff Sutherland is very salient on the issue of story points.  To quote: “relating hours to story points causes a huge impediment for the team. If the team in generating continuous process improvement, the hours per story point will be continually decreasing. Assuming a number of hours per story point makes it impossible for the team to show they have improved and fixes in their mind that there is a reasonable number of hours per story point.”

Also from Jeff, “My venture companies typically have two stable references stories for 1,2,3,5,8,13,20 points. It is easy to see where stories fit. The Product Owner can help keep people honest on estimation…The problem for our teams is not inflating estimates. As they go faster they estimate similar stories to have fewer points unless they are very careful to have stable reference stories. For some of them we know they are going twice as fast and velocity has not change. This is an impediment because they cannot tell when a process improvement has helped.”

Oaz said…

The example “lick 1000 stamps” is misleading because you never lick 1000 stamps in software development. You often lick 1 stamp. Occasionally you could lick a couple of stamps. After that you implement a StampLicker that does the job for all the remaining stamps with zero effort for the members of the team.

Though, the complexity of the StampLicker implementation has to be considered.

John Clifford said…

Hello Mike,

I have to question the premise of this article, because equating effort to time is, in my opinion, a huge mistake. They are related, but not equivalent, and equating the two violates a basic rule of estimation: estimate size, derive duration. What ties effort and duration together is velocity, or the rate at which the effort is exerted to implement the desired functionality. Yes, managers want to know duration rather than effort, but we are not helping them or ourselves by confusing or equating the two.

Instead of equating effort to time, it should be considered as analogous to distance. For example, Mike, and Tobias, and I, and everyone else reading this thread can generally agree on what a mile is, and if one distance is twice as far as another, or half as far. However, we will disagree on how long it takes us to cover that mile because that is dependent on our individual abilities. I’ve met Tobias, and I think he could run a mile faster than I currently can (wait until after arthroscopy!). Similarly, a world-class miler would accomplish the same amount of work more quickly than either of us. What’s the difference? Velocity.

Looking at the licking-stamps-versus-brain-surgery example, I would disagree that they should have the same measurement in story points. Brain surgery requires far more skill, and I think both a brain surgeon and a paper boy would agree that the effort required to lick a thousand stamps is far less than the effort required for even simple brain surgery… especially considering the differing Acceptance Criteria. That the brain surgeon can perform the surgery in the same amount of time as the teenager can lick and stick 1,000 stamps is due to the fact that his velocity for brain surgery far exceeds the teenager’s velocity for brain surgery even if their velocity for stamp licking is similar. Similarly, two different Scrum teams may agree on the same story point estimates for two different stories, but one team may take far longer to implement those stories due to their lower velocity (ability to turn ideas into potentially shippable increments of software functionality). Of course, how people do what they do also affects velocity. I can write the same functionality in assembler as I can in C#, but my velocity will be much higher in the high-level language.

Accordingly, I don’t really care about duration when I’m estimating because time-based estimates are closely tied to the person doing the estimate; I may answer “10 minutes” when asked how long it would take me to cover a mile while Carl Lewis may reply “5 minutes” and we are both right (if we have to run that mile in the time given) and both wrong (if we are estimating for the other person).

In short, story points are best used as a relative measure of effort/complexity/size instead of duration, and then a valid duration can be derived by factoring in the velocity of the people who will actually do the work.

tarandeep said…

Mike, I read your Agile Estimating and Planning book and it says story points estimate relative size and then duration is derived from it by using velocity! I am confused when I read this post of yours. Is this not contradictory to finding relative size and deriving duration? Is this not the same as traditional estimating, where we derive duration directly?

Mike Cohn said…

Hi Taran—
I don’t find this post contradictory to my Agile Estimating and Planning book at all. Consider this quote from Chapter 4 of that book: “There is no set formula for defining the size of a story. Rather, a story-point estimate is an amalgamation of the amount of effort involved in developing the feature, the complexity of developing it, the risk inherent in it, and so on.”

I wrote this post because too many people were misunderstanding relative estimating in story points. They were interpreting it to be that we estimate solely the complexity of a user story. That’s wrong. Complexity is one factor and often perhaps a key one in software development. The point here is that we are still estimating how long something will take: doing this will take twice as long as doing that. But we are doing that in relative rather than absolute units.

Mike Cohn said…

John—
I believe we are saying the same thing. See all my replies in which I comment that what we are estimating is effort which is a function of how long something will take as adjusted by all sorts of things such as complexity, uncertainty, risk, etc. So I am not equating the two.

The point of this post was that story points are not “complexity points” as I had been hearing people rename them. Measuring nothing but complexity does not help us answer the questions asked by clients, customers, or bosses.

Simon Cockayne said…

Hi Mike et al,

What a super thread!

Ok, I am clear on the point of your post that “story points are not complexity points”.

I think what is slightly confusing though, is that on one hand you say “a story-point estimate is an amalgamation of the amount of effort involved in developing the feature, the complexity of developing it, the risk inherent in it, and so on.”, but in at least one place in the book you say, I think, a story point estimate is, paraphrased “a measure of pure size”.

Do you feel those two statements are in conflict or are you saying that Size = f(Effort, Complexity, Risk,...)?

Moreover, do you equate Effort to Duration?

Cheers,

Simon

Mike Cohn said…

Hi Simon—
I’m glad you get that story points are not “complexity points.”

I am saying Story Points = f(Effort, Complexity, Risk, etc.)

Effort is not equal to duration. I stick with the standard PMBOK definitions of those rather than try to redefine things.

Jose said…

Imagine that we have a team of painters and their job is to paint walls. The team see pictures of the walls and they estimate the wall area to be painted. There are small walls, medium walls and some really big walls. The team estimate each area using relative points. They start painting the walls and after some time, they are able to calculate how many points they can do on each sprint, the velocity. Now the customer can calculate how long the project will take, using the velocity and the product backlog.

We are able to predict the duration of the project without using any “time” in our estimation, because the duration is based on the amount of work done per sprint. No need to use time in our estimation.

I always thought this was how story point estimation was done.

Mike Cohn said…

Hi Jose—
This is how story point estimation is done. Your example is a great one. It’s one I’ve been using in classes for a couple of years now. But let’s think about what your hypothetical team thought about when creating the estimates.

Let’s keep it simple (at first): no furniture in the room, no crown molding, etc. In your example the team had some really big walls and some small walls. They sure didn’t estimate based solely on complexity. If they did, both walls would have the same number.

Now let’s add some windows and window sills on the small wall. Add enough that our hypothetical team looks at a big wall and this small-wall-with-windows and wants to put the same number of story points on the two walls. If story points were merely “complexity points” they would have put a higher number on the small wall (since it’s more involved to paint).

What’s going on here is that the team is estimating in the only thing that matters: how long the work will take as adjusted by factors like uncertainty, complexity, etc. This is why the large wall and the small wall with windows get the same value. And intuitively that makes sense because these walls will take the same amount of time to paint.

Some at this point will say “Then why use points at all?” I’ll share just a couple of reasons here because that topic deserves its own whole post someday (and it’s also covered in the Agile Estimating and Planning book):

<ol>
<li>Suppose the painting will be done by a professional house painter and his 8-year-old kid. They can’t have a discussion directly around how long it will take. Dad says “This room will take me 4 hours” and the kid says, “No way, more like 4 days.” And both are right. They can discuss the room relatively: “Hey junior: This room is smaller but will require more taping around windows so I think it will take just as long as the big, easy room we looked at earlier.” So points let people with different skill sets discuss the estimate meaningfully.
</li>

<li>Suppose that you’ve been making these wall painting estimates from photos I’m showing you of the walls. There’s nothing else in each photo but the wall. You come up with all your story point estimates and then I spring on you that I’ve been showing you pictures of Bill Gates’ house. The wall you thought was 15’ by 10’ is really 450’ x 300’. If that were true of all the photos I showed you, your point estimates would still be valid because all the estimates were relative. Your velocity would be much lower than you’d anticipated, of course, but the estimates would still be valid. If those estimates had been done in a non-relative manner such as perhaps with ideal time, you would have to redo the estimates.
</li>
</ol>

So, Jose, your example was a great one and your understanding of story points is perfect. What’s happening though is that your mythical painting team is thinking about how long each wall will take when they estimate it.

Jose said…

Hi Mike, thanks!

These two examples are wonderful!! I can really understand what you are saying about using time in estimations and the examples show how to think in relative manner.

But I should insist that maybe we can estimate without using time and achieve the same results for the customers.

Let’s go back to the painting team example. You said that the painting team were thinking about how long each wall will take when they estimated it. But that’s not really true. They were told not to think about how *long* it would take to paint the wall, but to think only about the wall area: they must *measure the area* in relative points, the size of the wall. Well, after some sprints they know how many points they can do in each sprint, so now they are able to predict the duration of the project. The team wouldn’t have to think about how long it takes to paint a wall until they actually start doing it.

Another example. Forget about story points, let’s measure the wall in square meters. Imagine that the team has measured the *area* (and only the area) of all the walls in the backlog. They start painting the walls and after 2 weeks they find they have done 3 walls: one with 100 square meters, the second with 50 m2, and the third one also with 50 m2. So they now know that they can paint 200 square meters per sprint. Suppose that the backlog has 2000 m2, and so I can predict the duration of the entire product: 10 sprints.
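
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of deriving duration from size and observed velocity using the numbers above; the function name `sprints_to_finish` is made up for the sketch, and rounding up to whole sprints is an assumption.

```python
import math


def sprints_to_finish(backlog_size: float, velocity_per_sprint: float) -> int:
    """Derive duration (in whole sprints) from total size and observed velocity."""
    if velocity_per_sprint <= 0:
        raise ValueError("velocity must be positive")
    # Round up: a partly used sprint still counts as a sprint (assumption).
    return math.ceil(backlog_size / velocity_per_sprint)


# Walls finished in the first two-week sprint, in square meters.
completed_walls = [100, 50, 50]
velocity = sum(completed_walls)  # 200 m2 per sprint

print(sprints_to_finish(2000, velocity))  # 10
```

Note that no time estimate appears anywhere in the inputs; time enters only through the observed velocity.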

My whole point is that we can estimate attributes of things that are not time or time-related and still be able to predict duration using these estimates, if the attributes are directly proportional to time. Wouldn’t that be true?

I think that wall area is directly proportional to time to paint, so I can estimate areas and derive duration using area painted per sprint. Would that be true for software complexity? Is software complexity directly proportional to time?

Mike Cohn said…

Hi Jose—
When I teach my Agile Estimating and Planning class, one slide comes up that I stress is one of the most important in the class. It says “Estimate size, derive duration.” It’s about doing these in separate steps. We estimate size by saying “this work is such-and-such big.” We derive duration by dividing size by velocity. We can of course get fancier—working with ranges, confidence intervals etc. But that’s the general idea.

When you suggest measuring the square meters of the wall, you are using that as a measure of size. So in your approach you would be “estimating size, deriving duration,” which is exactly my goal.

This works because you picked a metric that will closely correspond with the duration of the work. But I’ll contend it could be quite wrong because some walls are dark colors now but we want to paint them white. Other walls have fancy molding on them or doorways or windows cut in them. Other walls held our taxidermy collection and we need to patch those big holes before painting. So your metric is one that would have a high correlation with total time spent painting but it wouldn’t be hard to factor in other things and come up with a better estimate. In factoring in those other things (presence of molding, paint color, holes to patch, etc.) the common element is the effort involved. I can’t add “50 square meters plus 3 big holes + crown molding.” The only common denominator for them is effort: “Hmm, this 50 square meter wall with its 3 big holes and crown molding is probably about the same effort to paint as that 100 square meter wall with no extra factors.”

Alternatively, we could create a parametric estimation model by analyzing a lot of data and coming up with total paint time = square_meters * 1.5 + holes * 8 + linear meters of crown molding^2 * 7.2. COCOMO is the most famous parametric estimation model but there are others. These have never really caught on in practice though.
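
As a rough sketch, the formula above could be written as a Python function. The coefficients are the illustrative ones from the formula, not a calibrated model; the time unit is not specified, and reading "crown molding^2" as the molding length squared is an assumption. The function and parameter names are made up for the sketch.

```python
def total_paint_time(square_meters: float, holes: int, molding_meters: float) -> float:
    """Illustrative parametric estimate of paint time (unit unspecified).

    Coefficients come from the example formula in the text; a real model
    would need calibration against local data.
    """
    return (
        square_meters * 1.5          # base cost per square meter of wall
        + holes * 8                  # fixed cost per hole to patch
        + molding_meters ** 2 * 7.2  # assumed: molding length squared
    )


# A plain 100 m2 wall versus a 50 m2 wall with 3 holes and no molding.
print(total_paint_time(100, 0, 0))  # 150.0
print(total_paint_time(50, 3, 0))   # 99.0
```

Every extra factor that matters (paint color, doorways, windows) would need its own parameter and coefficient, which is part of why such models are costly to build.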

Jose said…

Hello Mike!

I understand what you’re saying, my example was really too simple. But imagine that I could find such a function with parameters for every wall characteristic that affects estimating (probably there aren’t too many parameters for walls; at least I could create or adapt a function for each painting project). Then I could use it to derive durations, without using time in estimations, right? I’m not advocating doing this in software, just making clear that it’s possible to do estimations and derive duration without using time in estimations. But in software this is much harder to do. How to find such a function? COCOMO could be a way but it looks too costly and has many constraints. It’s better to stick with expert judgment.

But, to my surprise, Boris Gloger posted yesterday an entry in his blog on the same subject (http://borisgloger.com/en/2010/10/12/agile-estimation-basics/), but with a thesis opposed to this post. He says that he “never estimate efforts, because that does not work”. To estimate a story one should “simply try to define the ‘dimensions’ of this story, which have nothing to do with the implementation of this story”.

Actually I’m a bit confused now. You two are very respected gurus in the Scrum and Agile community, but it looks like you have opposite opinions on this very important and difficult subject.

By the way, all the comments in your Blog are so good that you could create an entire book just with them. Thanks a lot for all your time and attention to us.

Mike Cohn said…

Hi Jose—
I have not read that other blog so I cannot comment on it. Yes, you could create a parametric estimation model but you say yourself you don’t want to because of the effort involved. That and the need for local calibration have been the challenges of otherwise great approaches like COCOMO.

Jose said…

Hi Mike,

I thought you could comment on another blog, not aiming to criticize, but to bring more wealth to the debate, more knowledge for the community.

It would be very interesting and helpful to us if you could give us your opinion about those ideas.

Freyr said…

If I understand the conversations above correctly, I see two sides here:

1. Improvements in the team are reflected in the fact that individual story estimates are gradually reduced.  So a story last year which was a 5, now has some process improvements and this year is a 2.  Velocity stays steady, story points per story decrease.

2. Improvements in the team are reflected in velocity.  So the story of 5 since last year is still a 5, but now we can do more such stories in one sprint.  Velocity increases, story points per story stay steady.


In #1, the only way for me to see improvement trends is to monitor stories which are comparable.  I can see that we are better now than before by comparing two similar stories done at two different times.  Velocity tells me nothing.  Additionally, after every sprint, stories which have already been estimated are potentially immediately out of date due to process improvements.
In #2, I can simply monitor velocity.

I like thinking in terms of physical distance.  I draw an analogy between the following two questions:

1. What is the story point value of this story?
2. What is the distance in meters from point a to point b?

If I were to ask 5 different people “How long will it take you to get from point a to point b?” I would get 5 different responses, a lot of discussion, and no valuable result. The value I see is in the relation between distances. If the 5 people all agree that the distance from point a to b is “1”, then when I ask them to estimate the distance between point c and d, they can do so pretty accurately, even if their premises are completely different. This distance never changes; it is a constant. So even if the premise of the estimate changes (e.g. you can now use a car to cover that distance) the actual distance is the same, and you measure progress by realizing that you can cover more of this distance in the same time than before your process improvement (the car).

If I can cover 3 times the distance from a to b in one sprint by walking, then my velocity is 3.  If my process improves by introducing a car, then I can cover 30 times the distance between a and b, shouldn’t my velocity be 30?  Why would I take the individual story (get from a to b) and reduce its value from 1 to 0.1?

Nirav Assar said…

Mike, the concept of this post does not make sense. If story points are really about effort and effort is time, then what is the point of having some meaningless conversion? Why don’t you just use time as your estimate? When we say velocity in physics, it is d/t (distance over time). Story points are the equivalent of distance in software terms, and a story point is complexity. Velocity in agile is basically how much complexity you can tackle per sprint (iteration). If story points are just effort, 3 programmers on a regular work week can only cover 120 hrs of work. Your velocity never increases.

Thus your point makes no sense. I think you are way off on this one.

Mike Cohn said…

Hi Nirav-
I’m sorry the post made no sense to you. I stand by its points though (pun not intended there). This post was clarifying how people misuse points and think of them solely in complexity, which is a wrong use. I didn’t attempt in this post to cover the advantages of points. But that did come up in at least one of my many replies to the comments on the blog. For example, see my reply on October 10, 2010 at 12:54 pm above.

As for your point about velocity in physics: You are absolutely right. I hate that we call this velocity but that is a legacy term from the original XP teams. I prefer to call it “pace” and when I teach these concepts in classes I do mix that term in along with the standard “velocity.” But, I don’t like fighting lexicons so I use the fairly standard terms such as velocity instead of pace and “product backlog” instead of what my earliest Scrum teams called the “prioritized features list (PFL)”, which I also think is a better term.

Veselin said…

Mike, I have one big problem with story points; I’m not sure you are aware of it. I see many teams new to agile estimating every single story, and I mean hundreds of them (in one team close to 1000), prior to the project start. The reason is that teams are asked how long it is going to take them to finish the project. So the obvious answer is: let’s estimate every little thing in story points, we know the velocity and voila, magic happens.
What happens in reality: the project stalls for weeks, and the only thing I hear about is poker games and story points. Actually I even had one team complaining that they are slowed down because now they do agile stuff :)
BTW, I do know what Craig Larman says about it (and the discussion about workshops and finding the right balance for when to start a project) but I want to hear your opinion about it too. I wonder whether other people have seen the same?
Thanks

Mike Cohn said…

Hi Veselin—
I don’t see how that is specifically a problem with story points. Those same teams—in their pre-agile days—would have spent the same multiple weeks doing a big task decomposition to try to appease some boss with a “perfect” estimate.

And, yes it is critical to find the right balance between anticipating and adapting to changes. I have a post specifically on that topic at http://blog.mountaingoatsoftware.com/balancing-anticipation-and-adaptation

Arran Hartgroves said…

I find the complexity story pointing approach commonly, but I believe teams often sub-consciously relate this to effort when assigning relative values so hopefully this misconception hasn’t caused too many estimation issues. I feel sorry for the guy that did brain surgery 1 sprint (based on complexity), to then take on licking of stamps the next sprint! Great analogy by the way.

I suspect the complexity approach is widespread. Has this been passed on through some common syllabus or book, or is this just a common misconception?

Angry Ashley said…

Personally, I think this article is just plain wrong. There is a difference between complexity and a different outcome in mind. In my view complexity should only ever be used to give an indication of duration. The velocity planning is designed to give a measure of duration when team size, particular BA/Dev/QA skills are all taken into account. This is different from complexity for a reason – the relative complexity of one story next to another will change less often. If you get two iterations into a release and think your velocity is slow so you add another dev, do you go through and re-estimate all the effort, or, if you have a relative complexity, do you just apply a shift in the velocity across all points? In most Agile environments the near-enough reapplication based on relative complexity and velocity is the right measure. If your project is time critical or particularly constrained then you may go to the extra effort of recalculating effort but rarely is effort actually estimated for anything but the next few stories. If you need effort to be calculated for more than this then you are losing some of the benefit of being “just-in-time”. Whether or not this is fine depends on the risk profile of the project.

Mike Cohn said…

Hi Angry Ashley—
I believe you have missed a couple of key points in what I’ve written, which is quite understandable as some came out only in the lengthy discussion that followed the initial post. You are correct that “velocity planning [I’d call this “release planning”] is designed to give a measure of duration when team size [is] taken into account.” Yes, that’s duration. What this post is about is that story points are an estimate of *effort* and that duration is derived by dividing effort by velocity. Since velocity will change (as you allude) as team size or skills change, that will change the duration produced by that division. But the size of the team does not change the effort involved. Effort is independent of team size, skill, etc. So your comment about “you may go to the extra effort of recalculating effort….” is wrong. Effort will be the same. This would be a recalculation of duration.

Enric L. said…

Mike, based on what you say in this blog, does the sequence of estimating user stories affect how those are estimated?
For example: if I have 10 user stories that are exactly the same in nature, let’s say 2 story points each, but as soon as I finish the first one, the second one will become much easier and faster to complete. Should we then assign 1 story point to the second one, third one, and so on? Or should all remain 2 story points regardless?

Mike Cohn said…

Hi Enric—
Here’s the way to reason through to an answer on a whole class of questions just like the one you pose: The sum of the estimates on the product backlog items should always represent the true overall size (as estimated) of the product. With that in mind, if we estimated all the items you describe as 2 points each we would overstate the size of the product to be built. We’d call it 20 points when the real size is presumably 2 + 9x1 = 11. So, the right answer is to put the lower number on 9 of the 10 stories, put 2 points on one of the stories. And if it’s somewhat undetermined which of the 10 will be done first consider putting notes on the stories like “Assumes this is done after story x.”
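
The two ways of totalling this backlog can be checked with a short Python sketch. It simply restates the numbers above; the variable names are illustrative.

```python
# Ten similar stories; the first one done costs 2 points, each later one 1 point.
naive_estimates = [2] * 10            # every story estimated as if done first
adjusted_estimates = [2] + [1] * 9    # one full-size story, nine cheaper ones

print(sum(naive_estimates))     # 20, overstates the size of the product
print(sum(adjusted_estimates))  # 11, the estimated overall size
```

The adjusted list only holds if exactly one of the ten stories carries the 2-point estimate, which is why a note on which story is assumed to come first is useful.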

Freyr said…

Ok, so process improvements (or refactorings, etc) can reduce the effort of a subset of subsequent stories.  What does that leave for reasons for potential velocity increase?  Excluding the process improvements which would affect every single type of story (where it would be easier to just leave the stories as they are and watch velocity grow), does that only leave things which are directly related to personnel?  Like adding more team members, training them, improving teamwork and communications skills, etc.?

I wonder about this because I do not want to re-estimate my entire future backlog every time we make a small process improvement (which for our teams is at least once every two weeks).  I wonder if it makes sense to catalog each improvement and attempt to identify if it directly affects a certain type of story, to get an overview of which stories would (potentially) need to be re-estimated.

Mike Cohn said…

Hi Freyr-

I don’t see why you would need to re-estimate your entire product backlog. I suspect you are trying to do things with velocity that are best not done. Although in concept you should be able to look at a team last year and this year and say something like “this team is 14.2% faster” that is unlikely to be the case. For that to be true a team would need to estimate this year’s user stories relative to last year’s user stories but do so with last year’s capabilities in mind. It’s hard enough estimating with our current capabilities in mind. An example:

Story A was done last year and was estimated at 5 points.
If Story B had been estimated last year at the same time everyone would have called it 5 points as well.
But Story B isn’t estimated last year, it’s estimated this year.
And the team has improved—more training, better at testing, read a few good Java books, etc. They estimate Story B and call it 4. They justify that estimate because “we can do Story B right now a little faster than we did old Story A.”

The team then starts a sprint into which they bring Story B (4 points) and let’s say Story C (1 point). They have a velocity today of 5 points. But in some sense we can say they should have a velocity of 6 points and that they are 20% faster.
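
To make the arithmetic concrete, here is a small Python sketch. It assumes Story C would also have been estimated at 1 point on last year’s scale, which the example does not state; the variable names are illustrative.

```python
# Points for the sprint's stories as estimated today.
todays_estimates = {"Story B": 4, "Story C": 1}
# The same stories as they would have been estimated last year
# (Story C's value is an assumption for this sketch).
last_years_scale = {"Story B": 5, "Story C": 1}

velocity_today = sum(todays_estimates.values())          # 5
velocity_on_old_scale = sum(last_years_scale.values())   # 6

improvement = velocity_on_old_scale / velocity_today - 1
print(f"{improvement:.0%}")  # 20%
```

The improvement is visible only when the same work is measured on the old scale, which is the hard part in practice.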

Your argument will likely now be that since your team improves every sprint (congratulations) your team should re-estimate Story B if it had indeed been estimated a year ago. Not to do so would, for your team, overstate velocity. However, it should be pretty rare that a team is doing a year-old story. If they are, the story should have been an epic that was broken into smaller pieces and re-estimated. Further, the potential problem here is easily enough fixed by saying “We will re-estimate any story over a year old.” So you wouldn’t be re-estimating your entire product backlog, only ancient items that got left undone for a long time, of which there should be very, very few.

But to be clear, I don’t recommend doing any of this because I’m not wild about the idea of using velocity as a way of measuring team productivity improvements.

Freyr said…

Thanks for the reply.

We try to keep a forward-looking cone, so that e.g. the next 4 sprints are estimated and broken down to enough detail to pull, the next 4 slightly less detailed, etc.  What tends to happen is that stories get rearranged (partly due to other problems, e.g. not planning for iterations, but also due to legitimate reasons), which might mean that some stories might “circle” around sprint 3-5 for a while before finally reaching sprint 1.  This means that the estimates for these stories might be 3-6 months out of date.  When they do come closer (sprint 1 or 2) of course we re-estimate them and often rewrite them, but while they are circling, they have not yet reached our focus point of re-estimation.  Which means that they represent an increasing risk to planning.  I could add an “age” factor to each story, and “force” a re-estimation (or re-evaluation) every x months, but I am always wary of introducing more complexity to our backlog.

How would you deal with the risk factor such stories impose on project planning in light of constant improvement of teams?

My initial concern was indeed directly related to using velocity as a way of seeing (not necessarily measuring) that a team is improving.  How would you prefer to measure/show team improvements? 

I realize that my questions break the scope of this post, but I would be much obliged if you could point me in the right direction.

Thanks,
Freyr.

Mike Cohn said…

Hi Freyr—
It’s hard to propose a solution to your re-estimating problem without being there to delve much more into its specifics. In general I don’t re-estimate a lot. I want the product backlog items to all contain about the same amount of uncertainty, otherwise it messes with velocity and predictions you can make from it. I’ve written about re-estimating elsewhere on this site.

As for measuring productivity, the best approach has been that done by Michael Mah using Larry Putnam’s productivity index. If you search for Mah you’ll find some papers by him. There are also references to him on the http://www.SucceedingWithAgile.com site in the presentation in the Resources section.

Freyr said…

Thanks for the ultra-swift response!  I wasn’t expecting a reply to such an old post for a while.

I appreciate the nudge back on track.  Thanks.

Jake Gordon said…

I’m really late to the game but what a great discussion! I’m surprised it stirred up so much controversy. My takeaway is simply that when factoring complexity and uncertainty into your story point estimates one is ultimately just stating that a story is likely to take longer because it is complex or because there is probably additional work that is not yet understood. I have trouble imagining circumstances where complexity does not equate to time, at least indirectly.

This post also led me to another excellent post that answered some other questions that were haunting me: http://blog.mountaingoatsoftware.com/to-re-estimate-or-not-that-is-the-question

Thanks for the insight!
