Story Points Estimate Effort Not Just Complexity

A client asked me, "When will my team be done with this project?" This is probably the bazillionth time I've been asked that agile project management question in one way or another.

I have never once been asked, "How hard will my team have to think to develop this project?"

Clients, bosses, customers and stakeholders care about how long a project will take. They don't care about how hard we have to think to deliver the project, except to the extent that the need to think hard implies schedule or cost risk.

Teams Think Story Points Are Just Complexity

I mention this because I find too many teams who think that story points should be based on the complexity of the user story or feature rather than the effort to develop it.

Such teams often re-label "story points" as "complexity points." I guess that sounds better. More sophisticated, perhaps.

But it's wrong.

Complexity is a factor in the number of points a product backlog item should be given. But it is not the only factor. The amount of work to be done is a factor. So, too, are risk and uncertainty.

Taken together these represent the effort involved to develop the product backlog item.

An Example of Why Points Can’t Be Just Complexity

In a Scrum Master course a few years back, I was given a wonderful example of why story points cannot be just complexity. Let me share it with you now.

Suppose a team consists of a little kid and a brain surgeon. Their product backlog includes two items: lick 1,000 stamps and perform a simple brain surgery -- snip and done.

These items are chosen to presumably take the same amount of time. If you disagree, simply adjust the number of stamps in the example.

Despite their vastly different complexities, the two items should be given the same number of story points because each is expected to take the same amount of time. In this carefully chosen example, the volume of work (licking 1,000 stamps and one snip to some part of the brain) and the complexity of that work combine such that each will take the same amount of time.

So complexity is a factor in the number of story points assigned, but only to the extent to which that complexity increases the expected effort.

Adding Risk and Uncertainty

Remember I said it was a simple brain surgery. (I’m not even sure that exists, but go with me for another minute.)

If there were a risk that the surgeon, once the surgery was under way, would discover extra work that also needed doing, that risk would be factored in and the estimate increased. So if we know a surgery will be simple, we might call it 5 points. But if there’s a risk the surgery uncovers other work to be done, we would increase the estimate to a higher value.
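One way to picture that adjustment is as a probability-weighted addition to the base estimate. The sketch below is only an illustration: the expected-value formula, the 25% chance, and the 8 extra points are assumptions made up for the example, not a method described in this post.

```python
def expected_effort(base_points: float, extra_points: float, probability: float) -> float:
    """Base estimate plus the extra work, weighted by how likely it is to appear."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return base_points + probability * extra_points

# A surgery known to be simple: no extra work expected.
simple_surgery = expected_effort(base_points=5, extra_points=0, probability=0.0)

# Hypothetical: a 25% chance of uncovering 8 points of additional work.
risky_surgery = expected_effort(base_points=5, extra_points=8, probability=0.25)

print(simple_surgery, risky_surgery)
```

The risky surgery gets the larger number only because its expected effort is larger, not because it is harder to think about.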

The Right Person for the Job Should Do The Job

Our example of the surgeon and the little kid points out another aspect of agile estimating. We should assume that the right person for the job will do the work.

We do not assume the little kid will finish school, go to med school, do a seven-year residency and only then begin the brain surgery while we have a skilled surgeon sitting in a cubicle licking stamps.

Of course reality intrudes and occasionally the "wrong" person for a job does the job, but that will rarely be as dramatic as in this example.

Story Points Are Effort

So, story points are an estimate of the effort involved in doing something. That estimate should be based on a number of factors, including the volume of work, the risk or uncertainty inherent in the work, and the complexity of the work.

But story points are not solely about complexity.



Posted:

Mike Cohn

About the Author

Mike Cohn specializes in helping companies adopt and improve their use of agile processes and techniques to build extremely high-performance teams. He is the author of User Stories Applied for Agile Software Development, Agile Estimating and Planning, and Succeeding with Agile as well as the Better User Stories video course. Mike is a founding member of the Agile Alliance and Scrum Alliance and can be reached at [email protected]. If you want to succeed with agile, you can also have Mike email you a short tip each week.

219 Comments:

John Sonmez said…

Hi Mike,

Love your book.  I agree with you mostly.  Except for one little catch.
What happens when something goes wrong?

In your example with licking the stamps, it is easy to fix, if you mess up licking a stamp, you just re-do that one.

If you mess up in the brain surgery, that simple task could become huge.  Or even if you don’t mess up, when you get inside that brain you could find that what you thought was simple was very complicated.

So, I would say that I agree with you that effort points are the amount of effort, not the complexity, but… the only thing I would suggest adding is to say that you have to factor in complexity to a certain degree, because complexity = variability. 

The random unknown doesn’t spring up too often when licking stamps, but during brain surgery, it is much more likely.

Excellent point though.  I always thought it was strange when people say “effort points” have nothing to do with time.  That is completely bogus.

Tobias Mayer said…

Nice post. Thanks for the clarity. Because we are not estimating in time units many people assume we are not measuring time. Your example helps remove this confusion.

David Bland said…

I tend to view the relative sizing of User Stories in terms of Complexity, Effort & Doubt that roll up into a “size”.

Am I in the minority on this?

Mark Kilby said…

David… you are not in the majority and I’m surprised by this stance from Mike.  I teach teams that points are ALWAYS a measure of complexity, effort, and doubt.  Otherwise, those adopting agile and story points (teams and management) have a more difficult time seeing the benefit of points.  If it’s just another measure of effort, why use them?  If it’s a different measure, then we have to think about them differently and that implies we have to have conversations about those differences.  Those conversations are key to an agile adoption.  Once a team has a velocity established, you can start seeing how the points map over time and you get a better sense of effort.  But if it’s just a measure of effort, I can see where you will have some folks that will try to map points to hours, compare points between teams, and start to use points as just another stick.

Andrej said…

I didn’t expect this. I always thought story points are a way to measure the size of the requirement, not the time it takes to implement the requirement.

If story points are about the time it will take to do the work, velocity doesn’t make sense anymore. Something which takes the team 10 story points in the beginning of a project might take the team 5 story points at the end of the project because of improved experience. If you measure story points in time, the team might be twice as productive, without this being visible in the velocity.

On the other hand, if story points measure size of the requirement, velocity will increase as the team gets more productive.

Tobias Mayer said…

@David (and others) Story points are “size points” and map to time retroactively. If they didn’t, how would we make release predictions? They provide empirical data for making predictions. Mike’s point, clearly made I thought, is that it isn’t about sizing complexity, but about sizing effort.

@Andrej As a team gets better they get more story points done because each point takes less effort due to improved tools and process. Velocity makes perfect sense: it shows that improvement.  Relative to other stories the sizes remain constant.

Mike Cohn said…

Hi John—
You’re absolutely right to bring up the issue of what happens when something goes wrong. When we estimate the effort involved in something, one of the things to consider is the uncertainty inherent in the user story or feature being estimated. Sometimes something complex may not have a lot of uncertainty around it (“I’ve done this brain surgery a million times; it’s a no-brainer,” says the surgeon. “One hour and I’m done.”). More often a complex thing will have more uncertainty. And I’m sure we’d see a correlation: the more complex, the more uncertainty.

If the uncertainty is small relative to the total effort, a common approach is to make the estimate a small amount bigger. (“It almost always takes 60 minutes but occasionally 65 so I’ll say 65 to be safe.”) However, if the uncertainty is greater, then a more common and appropriate approach is to use two estimates. In the Agile Estimating and Planning book I wrote about two-point estimating and suggested estimating a 50/50 number for the story and a number you are 90% confident in. That book also shows how to work with estimated ranges and turn them into plans.
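As a rough illustration of working with two estimates per story, the sketch below sums the 50/50 numbers and adds a buffer computed as the square root of the summed squared gaps between each 90% and 50% estimate. That root-sum-of-squares buffer is an assumption chosen for the example; this thread does not say which formula the book uses. The story values are hypothetical.

```python
import math

# Hypothetical (50%-confidence, 90%-confidence) estimates in story points.
stories = [(3, 6), (5, 9)]

def plan_with_buffer(estimates):
    """Return (sum of 50% estimates, buffer, buffered total)."""
    likely_total = sum(p50 for p50, _ in estimates)
    # Assumed buffer rule: root of the summed squared (90% - 50%) gaps.
    buffer = math.sqrt(sum((p90 - p50) ** 2 for p50, p90 in estimates))
    return likely_total, buffer, likely_total + buffer

likely, buffer, total = plan_with_buffer(stories)
print(likely, buffer, total)
```

Because the gaps are squared before being combined, the buffer (5 points here) is smaller than simply adding every story's worst case (7 points here), which reflects the idea that not every story will hit its 90% number.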

Mike Cohn said…

Hi Tobias—
Thanks. I’m glad you liked the post. I just love the example of the brain surgery / stamps and it was too good not to use there. I wish I remembered who gave it to me after a class one time.

Thanks also for helping to clarify my points on here. I know you totally get the difference I’m after and appreciate your input.

Mike Cohn said…

Hi David—
Thinking of points as a function of Effort, Complexity and Doubt is fine. In my reply above to John I just combined Complexity and Doubt into one thing: Uncertainty.

Points are a measure of how long it will take (effort). How long it will take can be affected by other things and those can influence our estimate. The key is to remember and understand that it is always about time—no client ever cares how hard we had to think, only how long it took.

Mike Cohn said…

Mark—
Points are just another way of estimating effort. Other things can affect that estimate to the extent they affect effort. But points are about effort. There are many, many advantages to points beyond being a sort of chimera of effort and other factors. See Agile Estimating and Planning for many.

One of the most compelling reasons to prefer story points, even though they are another way of estimating effort, is that they allow individuals of different skills to discuss the relative size of work. My favorite example is of two runners at the start of a trail: one says it will take 5 minutes to run, the other says 10 minutes. Both are right. Those are the amounts of time it will take each to run. There is no time-based argument they can have to settle on the “right number.” However, both can agree that some other trail is twice as long. One runner will be thinking “Yep, 20 minutes” and the other will be thinking “Yep, 10 minutes.” But both can agree “twice as long.”
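The runners example can be written out in a few lines. The 5 and 10 minute figures come from the example above; the function and variable names are hypothetical, and treating the first trail as 1 point is an assumption made for the sketch.

```python
# Minutes each runner needs for the first (reference) trail, per the example.
FIRST_TRAIL_MINUTES = {"fast_runner": 5, "slow_runner": 10}

def minutes_for(relative_size: float, runner: str) -> float:
    """Translate an agreed relative size into one runner's own time."""
    return FIRST_TRAIL_MINUTES[runner] * relative_size

# Both runners agree the second trail is twice as long as the first.
second_trail_size = 2.0

for runner in FIRST_TRAIL_MINUTES:
    print(runner, minutes_for(second_trail_size, runner))
```

The shared number is the relative size (2.0). Each runner converts it to a different time, which is why they can agree on the estimate without agreeing on minutes.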

John Sonmez said…

@Mike

Thanks Mike, what you say makes sense.  I need to read your Agile Estimating and Planning book.  I read your Agile User Stories book, and it changed my perspective on user stories.

Mike Cohn said…

Andrej—
What size would there be if not effort / time to implement? Is there some other size that anyone would care about than time?

In practice what you’ll find from looking at a long-time agile team is usually a very moderate increase in velocity—they may know they are 3x as fast as when they started but velocity may be up only 20%. And that is often attributable to increased focus on work.

Mike Cohn said…

Hi John—
I’m glad you enjoyed the user stories book and hope you like the estimating one.

Mark Kilby said…

@David… My apologies for the typo… I meant to say “I don’t think you are in the minority”...but I could be wrong.

@Mike… Yes.. I’m very familiar with your Agile Estimating and Planning book since it emerged in early drafts.  My concern is that the tone of your post will cause those still new to agile to assume points = effort = hours.  Yes, effort is a key component of points, but the “uncertainty” is important as well.  Also, it’s important in describing points to emphasize the relative measure.  In your example of the stamps versus brain surgery, a team assuming points as effort would size those stories the same.  But what if the surgeon is out sick during the sprint and the team “committed” to completing all the stories.  I think if uncertainty was a stronger consideration for the points measurement, that team may not size them the same.  That’s where I think your 2 runners example is a better example.

Maybe we just need to agree to disagree on this?

Amanda Varella said…

Mike, I’ve read Agile Estimating and Planning, and after your post, I think I will read it again! If time must be considered in the point estimates, what’s the purpose of using points? Wouldn’t it be the same as using ideal hours? Ideal days?

Ken Clyne said…

Mike, I’m enjoying your new book.  I can picture you sitting in an open air restaurant in Mexico writing Chapter 1.

Anyway, I agree that sizing is about effort but there are different types of effort.  For example, there is the writing of the code and there is all the thinking (and talking) about the writing of code.  Take the coding of an insertion sort algorithm vs. coding a quicksort algorithm.  Same number of lines of code.  However the quicksort is much more difficult to write, requires more thought and there is a much greater chance that it will need a rewrite.  I find that it helps to talk about complexity and doubt to get people to think of the larger picture and include some of the intangibles in their sizing discussions.

Mike Cohn said…

Hi Amanda—
Let me start with a premise that the goal of estimating is to answer questions such as “When will you be done?” or “How much functionality can we have by the given date?” If that is true, then whatever unit we estimate in and whatever approach we use will need to be about time.  There are many advantages to using an entirely abstract unit such as story points. See, for example, my reply to Mark, on June 21 at 1:07pm above (about the runners).

A good, cheap way to find out more of the advantages is to look at this video on Agile Estimating and Planning that I gave at Google a couple of years ago.

Mike Cohn said…

Hi Mark—
Ah, but points do equal predicted effort (as adjusted by everything we know/don’t know about the item, including its riskiness, uncertainty, etc.). But those are always part of any effort estimate. If you ask me how long it will take me to go to the market and bring home some fruit, my estimate will be, say, 15 minutes. In coming up with that I thought about how long it will take to get there, the likelihood of traffic problems, the likelihood of a slow checker in the market, etc.

If the surgeon is out sick in the example, the team would be unable to do the work. No example should be stretched too far. The biggest problem with this one, for instance, is that it is of work done by a single person in both cases. A typical product backlog item would be worked on by 3-5 people. So the work here (stamp licking, brain surgery) is more tasks rather than user stories (product backlog items). And I don’t advocate using points for tasks on the sprint backlog. So, the example breaks down when pushed too far (as most do).

Mike Cohn said…

Hi Ken—
I’m glad you’re enjoying Succeeding with Agile. I wish I’d written some of it in an open-air restaurant in Mexico!

You are absolutely right that complexity and doubt need to be considered when trying to decide how much effort something will take. But those influence the number of story points assigned only to the extent that they influence the total time the story will take.

Another simple example about why complexity does not matter *except* to the extent it affects the time the story will take: Suppose I give you two lists of numbers. One list is a series of additions between two one-digit numbers, like:
4+3=
2+6=
etc
The other is of one-digit numbers to multiply:
1*8=
3*2=
etc
I have no real knowledge of this but I’m gonna swear that more synapses fire in my brain to do multiplication than for addition. Since every schoolkid is taught addition before multiplication it must be easier / less complex. However, I’m pretty sure I could process the two lists in the same amount of time, so they would have the same effort-based estimate whether that is in points, seconds, whatever. They’d have the same estimate even though I would think harder on the multiplication.

Now if the list were of numbers I could get wrong (12*13=) I would very likely increase my effort (point) estimate on the multiplication to account for that. But that is an example of increasing the estimate because the expected time to complete the work went up.

Alida said…

Hi Mike,

I frequently encounter team members who try to estimate a story by rolling up the tasks.  That is a common trap people fall into when they think in terms of effort.

On the runners example, I can see that they would estimate the second trail is “twice as long” in relation to the first one.  But how do they estimate (and agree on) the first trail if they have to run it together?  I am not sure if the analogy breaks down here, but basically how would they estimate a story they have to do together if their abilities are so different?

Mark Kilby said…

@Mike

I agree that points correlate to time… my concern is people will read this post and conclude that a certain point value will map to a certain number of hours and I think that would be a misinterpretation.  So I will agree to disagree that points EQUALS effort, but I will agree that it correlates to effort.

In my experience, I have seen where a certain point value maps to a range of effort hours (for a team) and that range varies from team to team and can change if the team or conditions the team works within changes.

On a different note, I’ve found all 3 of your books most valuable and they are always on my “top recommendations” list.  That’s why I feel so strongly about those new to agile not misinterpreting this article.

Marcello Leonardi said…

Hi to all
I have read both of Mike’s books and have been practicing the estimating suggestions for two years now.
I totally agree with Mike. At the end of the day the stakeholders are interested in time.
If you develop a product for a customer (project based), then he is interested in time, because he has to align his marketing actions, and in the end he is interested in the cost of the project, because he has a fixed budget to spend on building the product.
To get to the cost we need the time that a team will spend on this project until they deliver the product.
To get to this time you can work perfectly well with story points. As described in Mike’s books, we have two possibilities:
1) You have a team that has already worked together and can estimate the complete backlog of a new product in Story Points, and you know what those Story Points mean in terms of time and ultimately cost, because you can extract it from past projects.
2) You have a new team that has not worked together; then you have to find out what one Story Point means in terms of time and ultimately cost.

At the beginning of the year I started a new project (half a year later we are live with release 1.0 - http://www.home.ch/en/ 😉). For this project, to get to the time and ultimately to a cost estimate, I gave the team two tasks after we had estimated the backlog using story points:

1) Please break down two Stories (the first Story was 1 Story Point and the second Story was 2 Story Points) into Tasks and estimate each task in the hours you think you will have to spend on it.
2) Put together a Sprint 1 Backlog that you think you will fulfill.

For the second part I put together a sheet with all the resources I had for the first sprint. So I knew how much time my team was going to spend on the Sprint Backlog.

With these results and with the estimated release 1.0 backlog I could tell the customer that one Story Point will take 2.5 to 3.5 days -> this means xxx to zzz $ cost.
In addition I could tell him that we will take 5 to 6 Sprints to get the backlog done. My customer accepted the range, because he understands that we will get much more precise after every Sprint. Then after every Sprint we evaluate our estimates against the facts (results).
So as you can see in my example, I really believe that estimating with Story Points will work and that the translation into time and ultimately cost will work.

The most challenging part of my initial process was the relative estimating of User Stories.
If you don’t have much detail at the start of the project (in addition to the user stories we had some notes in the user stories, some first wireframes and a draft design idea, which by the end of the project had changed completely), it is very difficult to estimate relatively.
This was for us the most difficult part of the estimation process. The mapping from points to time and ultimately to cost is more or less mechanical work 😉
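The mechanical mapping from points to time and cost described in this comment can be sketched as below. Only the 2.5 to 3.5 days per Story Point comes from the comment; the backlog size of 40 points, the day rate of 1000, and the function name are hypothetical values chosen so the arithmetic is easy to follow.

```python
def time_and_cost_range(backlog_points, day_rate, days_per_point=(2.5, 3.5)):
    """Return ((low_days, high_days), (low_cost, high_cost)) for a backlog."""
    low_days = backlog_points * days_per_point[0]
    high_days = backlog_points * days_per_point[1]
    return (low_days, high_days), (low_days * day_rate, high_days * day_rate)

# Hypothetical backlog of 40 points at a hypothetical rate of 1000 per day.
days, cost = time_and_cost_range(backlog_points=40, day_rate=1000)
print(days, cost)
```

Keeping the result as a range rather than a single figure matches the comment's point that the estimate is narrowed after every sprint as real results come in.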

Andrej said…

@Mike
If the surgeon gets some new tooling which allows him to do the operation in half the time he needed before, would you say that the effort is also cut in half, or is the effort still the same, and is the velocity doubled?

Mike Cohn said…

Hi Alida—
You asked how the runners would agree on an estimate if they have to run the trail together. Well, (a) they probably wouldn’t agree because here we are talking more about a task than a user story. This is why I don’t recommend using points for tasks. However, (b) if we think of “run this trail” as more of a user story of perhaps “run three kilometers” we would split that into two tasks: one runner would run one kilometer, the other would run two kilometers, and they’d finish at the same time in the end.

Mike Cohn said…

Hi Marcello—
Thanks for sharing the story of your successful use of story points.

Abhilash Pandey said…

Hi Mike
There is no doubt that the story points will retroactively map to a certain number of hours, but doesn’t directly calling the story points a measure of effort completely defeat the purpose of story points, which is relative sizing? I always take story points to be based on what you suggested in Agile Estimating and Planning, i.e. how complex it is or how big it is. I tend to add doubt as a third factor there. With this post, I am sure a lot of people will conclude that if story points are a measure of effort then why not use hours, ideal days etc. instead of story points, which will put teams at the disadvantage of not getting the benefits of relative sizing. I remember another post by you long back on a similar note where you mentioned that there is no one-to-one relationship between story points and hours (effort) and it is actually a normal distribution. I believe that was a great explanation of the relation between story points and effort and I quote that example to every team I mentor on Scrum. Somehow, I find this post conflicting with the earlier post about normal distribution (http://blog.mountaingoatsoftware.com/how-do-story-points-relate-to-hours) and a consensus on story points at several other places.


On a side note, How does the team of a little kid and Brain Surgeon estimate the size of two items viz., lick the stamps and doing the brain surgery using planning poker? For licking the stamps, the kid might think of just licking the stamps and gives an estimate of 1 but the Brain Surgeon will think of the hygiene factor and will find a dependency of creating a stamp damper and estimates it as 5. Then for brain surgery the school kid estimates it at 40 and the Brain Surgeon estimates it to be a 3. The surgery can never be a 3, 5, 8….for the school kid, so how do they converge?

Mike Cohn said…

Hi Andrej—
If the surgeon gets a new tool that cuts (pun!) the time in half, then the # of points would be halved as well.

Mike Cohn said…

Hi Abhilash—
Yes, in Agile Estimating and Planning I wrote about the points being about how “big” the story is. I wasn’t as clear in there as I have been in this post. It is all about the amount of time something will take. No client/customer/stakeholder cares about complexity *except* to the extent that complexity affects my estimate of effort.

I’ve already addressed in other responses here that there are many, many reasons why story points as a measure of effort still have tremendous advantages over other things like ideal time.

I don’t see a conflict at all between this post and the one saying that points map to a range of hours rather than an exact value.

In estimating if the team is just the kid and the surgeon, the kid would defer to the surgeon’s expertise on the surgery and perhaps both would collaboratively estimate the stamp licking since both could relate to it.

Prashant Pathak said…

Mike,

Thank you for writing about this. With all due respect, I would differ with you!

I think Story Points make sense when we talk about features and complexities and are trying to gauge what the team thinks of that “Story”. Tasks on the other hand can be estimated in number of hours.

I understand when you say that, complexity is not what the clients want and I will have to get the estimated hours from the past to give a good estimate.

In practice, I use 3 measures
1. Story Points—measure of complexity (not in hrs.) useful for the team to discuss the complexity of the issue
2. Estimate in hrs.—could be a total of all the task estimates—good for estimating the time required for future tasks to the client
3. Cumulative Business Value—This is probably the most interesting number we can get to the client

Your thoughts are welcome.

Michael said…

Mike,

Very interesting post (and discussion). I have found your books to be some of the best resources on agile. But this post feels really misleading.

You say that Story Points should not be re-branded as Complexity Points, and used primarily as a measure of complexity. THAT I agree with.

But, the stamps/surgery analogy and some of your statements effectively say that “Story Points = Effort = Time”. That equation assigns Story Points a level of precision that they simply don’t have. It also assigns story points an absolute rather than relative value.  Abhilash and Mark correctly make this point. Story points may be CORRELATED, but are not directly equal.

The simple fact is that when estimating story points, the team (even a seasoned team) will not accurately know the level of effort for a story. What the team will be able to do accurately is rate a story’s effort, complexity and risk (or other factors that are helpful to the team—we use “risk”) relative to other stories they have completed or planned. Effort, complexity and risk are tools/criteria used to relatively evaluate stories.

YES, ultimately it is all about time/effort. But people are terrible at estimating time/effort. SO, we use relative story points that look at relative complexity, effort, risk, doubt, etc among stories. OVER TIME, those will prove out and be useful for planning purposes (and thus to the client).

But the bottom line is that I think it is absolutely useful to think and estimate in terms of complexity, risk or any other criteria which help you estimate relative story points. Consistency is more important than precision.

And in the end, consistency, and velocity (over time) are what will deliver the value clients want to know (when will X be done).

Sure, you are right in the end. But, I find this post misleading from a practical point of view, and likely to lead agile teams astray.

Mike Cohn said…

Hi Prashant—
It’s perfectly fine to disagree with me. However, from your comment I don’t see anything relevant I disagree with. I, too, estimate tasks in hours directly. I do that because it’s often fairly clear which team member will do a task so we don’t need to argue a lot of “it’s 4 hours,” and “no, it’s 8.” For story points, we use an abstract unit and estimate relatively. We can both say “Story A will take twice as long as Story B” while you are thinking a completely different amount of effort is involved than I am.

Tobias Mayer said…

@Prashant and others who want story points to be a measure of complexity.  My question would be: then what?  So they measure complexity. How is that useful to the business?  How does it help make predictions or do release planning?  The beauty of relative effort estimates is they eventually map directly to time, and to cost.  We can very soon make realistic date and cost estimates based on empirical data. I can’t see how you can do that if you are estimating complexity.

Of course, the ultimate goal is to have all stories approximately the same (small) size, i.e. take approximately the same time to develop, so we don’t need to estimate them. Velocity then simply becomes a count of stories completed per sprint.

Pretty much all software development is complex, singling out certain stories as “more complex” may trigger some good dialog, but is unlikely to be helpful to the business.

Mike Cohn said…

Hi Michael—
I’m glad you liked my books and am sorry you find this post misleading. I actually think the post is turning out far more useful than I’d thought initially in that it is uncovering some strongly held but incorrect understandings of what a story point should measure.

It was actually Mark above who wrote “points = effort = hours”. You are taking me to task for not saying points are correlated with hours. However, I agree with that. Points do not *equal* hours because points are an estimate. That is why I previously blogged that there is no equivalence relationship between points and hours. See http://blog.mountaingoatsoftware.com/how-do-story-points-relate-to-hours 

What I’m saying in this post is that points are an estimate of effort (“how much time will this story take?”). Perhaps the way to say that is that points are a function of effort, risk and uncertainty, or SP = f(E, R, U). (Call one of those complexity if you want; it’s not important.) The idea is that points are an estimate of the effort involved. Risk, uncertainty, complexity, doubt and other things people have mentioned here can be incorporated BUT only to the extent they affect the expected effort. If something is complex but that complexity will not affect the time to implement the feature, that complexity should not affect the estimate. That was my point with the lists of numbers to be multiplied or added.

You wrote that “it is absolutely useful to think and estimate in terms of complexity, risk or any other criteria which help you estimate relative story points.” I agree but, as in the prior paragraph, only to the extent those things affect the time the story will take. Complexity that doesn’t affect effort is not relevant.

I have no idea why this post will lead agile teams astray. I have been using this example in classes and consulting now for I believe 3 years. It always serves as a point of tremendous clarification. Far more likely to lead teams astray is when someone re-labels story points as “complexity points.” That pushes a team into thinking that complexity is the sole factor in sizing the story.

I witness this continually in my Agile Estimating and Planning classes. In one exercise I ask teams to estimate a list of things that includes, “Wash your boss’s Porsche” and “Read a 10-page academic article about Scrum.” Those are probably somewhat similar in time; maybe one is 2x. When teams in my class start asking “how complex is this?” they will end up saying that washing the car is mindless work and it then gets a much lower number than reading the academic article. Say 1 and 5. Yet both take the same amount of time. (If you disagree, alter the size of the paper or car; it won’t matter.) When the team runs a sprint of doing those activities they will find that they take the same amount of time and therefore should have been given the same estimate.

Jean Tabaka said…

Hey all,

Is anyone else reading these comments thinking about systems? Not to pile on, but, to pile on 😊, a complicated system is different than a complex system. So complexity counts for something at this very fundamental level. To round out this pile, there are also simple systems and, on the other end of the spectrum, chaotic systems. How are you going to declare points? Well, in a simple system, it is pretty darn easy. A straightforward solution awaits you, easy to articulate, easy to deliver. In a complicated system, we’ve got a solution; it’s just mighty complicated. Still, we know the solution to a problem. In a complex system, the solution set just isn’t that easy to declare; it is beyond complicated. It has difficulties that bring ambiguity. Is complexity then the same thing as doubt? I don’t care. I just know that my problem set is beyond complicated, so I have a lot more to think about before I respond to how I can deliver a solution. In chaos, nothing on this earth will help you provide point guidance IMHO. Complexity represents that differentiator between what is merely complicated and what has the ambiguous difficulties inherent in complexity.

I learned something of this from a presentation given by Jim Sutton at the Lean Systems and Software Conference and the work done by David Snowden on Cynefin. I may have it wrong. Like points, I probably do have it wrong.

Michael said…

Mike,

Thanks for the great response, and you’ve convinced me we actually do agree.

On SP = effort = time, I actually understood that directly from the post where you say: “the two items should be given the same number of story points–each will take the same amount of time.”  Although your clarifications now make it clear, that line seems like it is saying that SP = time directly.

The thing that is confusing here is that TIME is only known in retrospect. So, I find your responses much clearer than the original post.

The other thing I find confusing in both the original analogy and your car wash / reading example is this: I think they are confusing complexity and difficulty.

I (and my agile team) generally understand complexity to mean “number of moving parts”. In the car wash / reading example, both are equal complexity. One may be more mentally challenging (difficult), but both are straight forward, can be done by one person, etc.

The stamps / surgery analogy feels disingenuous. The brain surgery is inherently more complex (if we are being realistic). It requires a complex lab setup, nurses, anesthesiologist, and maybe a consulting physician by video conference. So my point is this: If we look at this and say “the stamps will take about 2 hours and the surgery will take about 2 hours”, and give both the same story points (i.e., 5), in the long run we’ll get burned. The greater complexity of the surgery will tend to mean over time, this type of operation will average maybe 4 hours (one time the anesthesiologist shows up late, the next time the video conference doesn’t work right and so on.)

I think you nailed it for me when you say, “I agree but, as in the prior paragraph, only to the extent those things affect the time the story will take.”  My whole point is that people are bad at estimating time directly. So, we’ve found it most useful to compare the relative effort, complexity, risk, etc. of stories. In retrospect, that method helps us estimate pretty accurately so that indeed we are accurate about how much time it takes.


Tobias: I am definitely not in the camp of saying we should “estimate complexity”. My only point is that the post feels like it is saying “let’s not care about complexity, let’s just estimate effort/hours”. I totally agree with Mike and you that the only point of the whole exercise is to figure out how long something will take. I am just saying we find the best way to accurately estimate effort is to compare complexity, risk, doubt, etc. They are just tools toward the same goal.

In any case: Mike, thanks so much for your time and this great discussion. Definitely provoked some deep thinking.  Hope to run into you around Boulder some time!

John Mc said…

Mike,

I’m Scrum mastering a couple of teams in a shop that’s newish to agile estimating and planning. These types of questions are haunting us. Let me set up a scenario and see what your thoughts are.

Let’s say that we have a Scrum team whose product owner wants them to wash 7000 Porsches.  That’s too many Porsches to wash inside of one one-week sprint, so they work with the product owner to split the story into 700 stories with 10 Porsches each.  The team has never washed any cars before, but they understand the process and are confident that they can knock out 10 stories (100 Porsches) per sprint, so they assign 5 story points to each story and accept the top 10 most valuable stories into their first sprint, for a planned velocity of 50 story points.

Now, as the team goes through their sprints, they are going to get better at washing Porsches.  What I understand you to be saying is that they should re-estimate their user stories as they get better at washing cars, but their velocity in story points should remain constant at 50 (or very close to it).  What makes more sense to me is that you let the estimated stories stand but let the velocity rise as the team gets better at washing cars.  Likewise, if the water company cuts water pressure to the car wash site, it will take them longer to fill their buckets and thus slow them down.  Do they need to re-do their estimates, or just let it wash out in their velocity?

This is obviously an over-simplified example, but in cases where software teams are doing relatively repetitive tasks (like, say, converting a web app’s UI from technology A to technology B, working page by page), the first one they do will take more time than the second, if the second and the third have a similar complexity (oops, sorry, couldn’t avoid that word there). This is similarly complicated by the fact that staff turns over, thus “re-setting” the learning curve. Would that necessitate a full re-estimation of the backlog?

We are really wrestling with this stuff when we do our sprint planning, so I appreciate the blog post and everyone’s great comments.

Thanks,
John

Mike Cohn said…

Hi Michael—
I’m glad to read we agree after all. I have edited the original post (something I usually avoid) to read that “each is expected to take the same amount of time” (about the surgery & stamps) rather than each “will.” You are right of course that the actual time is known only in hindsight.

As for the lab set up, etc for the surgery. Sure, there are many others involved but consider this example: Suppose a development team decides to put a number of points on a story of “hire a new programmer.” In considering that, they should only include the effort that will be expended by team members; they do not consider the effort of the non-team-member Human Resources staff. If we included HR effort in the estimate, we’d put too many points on it, overstate velocity and lead to later problems when the team’s velocity wasn’t benefited by the effort of HR members. So, in the surgery example, the surgeon was the only team member involved in that work.

I’m glad you’ve enjoyed the discussion on this one. I had no idea it would invoke such differing and strongly held opinions.

Mike Cohn said…

Hi John Mc—
Good question. In the case you described I would not re-estimate. The issue isn’t “Hmm, we’ve gotten better, let’s re-estimate. That story used to be a 5, now I want to call it a 4.9.” (Obviously an exaggeration.) In the case you describe the velocity would increase a bit as the team got better.
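To make the "leave the estimates alone, let velocity rise" idea concrete, here is a small sketch using the numbers from John's scenario (700 stories at 5 points each, starting velocity of 50). The learning curve is purely an assumption for illustration: velocity grows by 1 point per sprint and levels off at 70.

```python
def sprints_to_finish(total_points, velocity_for_sprint):
    """Count sprints needed when sprint i (0-based) completes velocity_for_sprint(i) points."""
    remaining, sprint = total_points, 0
    while remaining > 0:
        remaining -= velocity_for_sprint(sprint)
        sprint += 1
    return sprint

TOTAL_POINTS = 700 * 5  # the estimates are never touched

# Velocity stays at the planned 50 points per sprint.
constant = sprints_to_finish(TOTAL_POINTS, lambda i: 50)

# Assumed learning curve: +1 point per sprint, capped at 70.
improving = sprints_to_finish(TOTAL_POINTS, lambda i: min(50 + i, 70))

print(constant)   # 70
print(improving)  # 53
```

The improvement shows up as a shorter forecast, which is exactly the signal that would be lost if the stories were re-estimated downward to keep velocity at 50.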

Now, if you have 10-year-old backlog items you may want to re-estimate those to reflect a team’s improved productivity, but my first recommendation there would be not to have a backlog anywhere near that long. I typically recommend a backlog of no more than 100-150 items which would likely be 10-20 sprints, and then perhaps no more than 40 weeks. And that’s a maximum. I’d like the product backlog to reflect no more than about 3 months plus some big huge epics giving a wild view of what’s after that.

To your other question, I can’t recall a time where we’ve re-estimated the entire product backlog. (Except perhaps being 10 items into the initial estimates and deciding to re-baseline because perhaps our numbers were too tightly grouped.) But, for more on re-estimating including when I do recommend it, see http://blog.mountaingoatsoftware.com/to-re-estimate-or-not-that-is-the-question

Michael said…

Mike,

Wow, I’m honored! I think that change is helpful.

Your last answer (about the surgery support staff) made me think of a related question: Do you consider external forces in terms of how they affect effort? 

For example: suppose you are working with an external client/PO. You have a new feature that you know has to get input from some department, AND you are going to need some data sets from the sales group. You know from previous experience that this kind of story requires extra effort—communicating, verifying and processing all these externals.

Ideally, you have no dependencies, and to a large extent your ScrumMaster should be removing these impediments before they affect workflow. But still, you need to estimate weeks in advance.

Do you increase your estimate in this case? Tell the client the story isn’t ready for estimation until you have all the data and input you need?

Thanks

Mike DePaoli said…

Great Post Mike. It certainly got some synapses firing 😊

I completely agree that at the end of the day all the customers / stakeholders care about is how long it is going to take, because this largely determines cost and also dependencies for the broader organization.

One broader point about relative point estimation that I don’t think we should lose sight of: regardless of what unit one ultimately wants to arrive at, it is valuable because of how the human intuitive algorithm works. We are much more capable of processing patterns of information and bringing more information to bear when doing exercises like relative point estimation.

In software development it provides an excellent way for the work experience patterns of a cross-functional team to be integrated for the purposes of estimation.

So, bottom line, relative point estimation is useful to us as humans just because of the way our brains work. You might as well put the full power of your brain to work 😊

I only mention this because some posts on this thread smacked of the thinking ‘so what, story point estimation isn’t useful because it’s the same thing as estimating in hours?’.

Mike Cohn said…

Hi Michael—
I’m glad that change helps us agree on this.

As for the situation where part of doing a product backlog item requires the team to do something like nag some other team (“remember, we need such-and-such by Monday”) or put in some other effort (e.g., going to meetings to persuade them to do it at all), I would include that effort in the estimate of the backlog item. You didn’t ask, but when sprint planning comes along we’ll also create tasks like “nag IT group to get new server configured, 2 hours.”

But, in cases when you don’t need to start the story yet and it can be safely deferred another sprint, I might very well tell the product owner that we can’t estimate the work until some other group finishes their work or believably commits a date to us. I also tend to take that approach when the other group has a poor record of meeting their own commitments to us.

Mike Cohn said…

Hi Mike—
Yes, you are absolutely right that there are some tremendous advantages to story points because of how it forces us to think about the estimates. Perhaps I need to blog on that pretty soon, too. Thanks.

Russell said…

As Mike describes in his book “a fundamental and common challenge in many organizations adopting and adapting agile and lean product development is estimates and commitments are considered equivalent; in other words, one assumes that the commitment to cost, schedule, scope and quality has to be the same as the estimate; it doesn’t.”

What should happen is we derive an estimate and, on the basis of that estimate, make a commitment to develop and deliver a story or defined functionality at a specific level of quality, by a certain point-in-time and at a specific cost.

Estimates using a relative unit-of-measure, for cost, schedule, scope and quality are derived from a prioritized and sized Product Backlog during Release Planning.

When you are estimating story size, at the Product Backlog level, a story should contain just enough detail for the team to be able to estimate its relative size to other stories based on certain criteria such as presented in this blog.

My initial set of criteria starts with the following but is adjusted based on the reality of the situation:
1. Complexity
2. Uncertainty
3. Knowledge and experience with domain
4. Knowledge and experience with technology

Commitments per story using a unit-of-measure in hours for what the solution involves or what it will take to deliver the story are derived during each Iteration/Sprint Planning session.

When committing to getting stories done during Sprint Planning, a story should contain enough detail for the team to be able to determine what the solution involves and extrapolate what it will take them to deliver the story in hours based on having tasked out the level-of-effort required to get the stories done.

Mike Cohn said…

Dear “Complexity Points Better Off”—
For information on how to create a common baseline for multiple teams, see http://blog.mountaingoatsoftware.com/establishing-a-common-baseline-for-story-points

I’m not totally convinced, though, that you are estimating complexity. Your last paragraph fits exactly with what I’ve been trying to say throughout this post. Story points are an estimate of how long something will take. Complexity factors into that but is not the SOLE determinant of the number of story points. When someone re-labels “story points” to “complexity points” they turn the measure into a measure of one thing—complexity. That’s what’s wrong. It is more than complexity.

Vivek said…

Interesting!!!

I have gone through the valuable inputs on this blog. Very interesting.

My points:-

1) If we are using effort to enable planning & tracking, what do we do with Story Points? Why are story points required?
2) If complexity is helping to arrive at a better effort estimate, why do we resist using Complexity Points as one of the size measures?
3) If the team skills are changing & team members are changing, sprint capacity (man-hours) is changing sprint by sprint (which I believe is a realistic scenario for countries managing outsourcing - @ 20% attrition rate), how do I know the team/project performance?

Russell said…

I echo Mike’s point that “complexity” should be one factor of the relative “unit-of-measure” one uses when estimating; keeping in mind the distinction between estimating and committing.

For example the following factors:
1. Complexity
2. Uncertainty
3. Lack of knowledge and experience with the domain
4. Lack of knowledge and experience with the technology

It is in the purview of you, your team and your organization to decide what factors you are going to use to derive your points per story using a relative unit-of-measure; keeping in mind complexity is just one factor.

Mike Cohn said…

Hi “Complexity Points Better Off”—
I don’t disagree at all that there is a statistical correlation between complexity and effort. However complexity is not the only factor determining effort.

Mike Cohn said…

Thanks, Russell.

Mike Cohn said…

Hi Vivek—
Responding to your 3 points above:
1) I will need to write another blog post sometime about why I think story points are far better than estimating directly in ideal time or any time-based estimate. As noted in some of my replies above, some reasons are given in the Agile Estimating and Planning book but I’ve come across even more reasons to prefer story points.
2) Complexity is an element in determining total effort. I don’t know why, though, you’d want to separate out a different unit called “Complexity Points.” Stick with one unit called Story Points (or something similarly innocuous) and use it as effort as adjusted by things like complexity, uncertainty, risk, etc.
3) See my other blog post on what to do when team size changes. You can track that and make predictions if you track the historical data on relative changes in velocity when teams change size.

Russell said…

There are two primary reasons we estimate story size using a relative unit-of-measure at the “Product Backlog” level:
* the team’s understanding of each story

* a high-level cost, schedule and common understanding of scope

As for playing planning poker, practical experience tells me we do not do this for the entire Product Backlog for a myriad of pragmatic reasons.

What works for me is to have the team size each story as XXL, XL, L, M or S. Then, for the stories that are medium, we play planning poker. I assign XXL = 55 points, XL = 34 points, L = 21 points, M = 5, 8 or 13 (we play planning poker to determine whether the story is a 5, 8 or 13) and S = 2.
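The bucket scheme above can be sketched as a simple lookup, with only the medium stories needing a planning poker result. The function name and the example stories are hypothetical; the point values are the ones listed above.

```python
# Bucket-to-points mapping from the comment; "M" is resolved by planning poker.
SIZE_POINTS = {"XXL": 55, "XL": 34, "L": 21, "S": 2}
MEDIUM_OPTIONS = (5, 8, 13)

def points_for(size, poker_result=None):
    """Return the points for a T-shirt size, using poker only for medium stories."""
    if size == "M":
        if poker_result not in MEDIUM_OPTIONS:
            raise ValueError("medium stories need a planning poker result of 5, 8 or 13")
        return poker_result
    return SIZE_POINTS[size]

# Hypothetical backlog: (story, size, planning poker result if medium)
backlog = [("export report", "L", None), ("login page", "M", 8), ("fix typo", "S", None)]
total = sum(points_for(size, poker) for _, size, poker in backlog)
print(total)  # 21 + 8 + 2 = 31
```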

When it is time to plan our iteration this is when we use a more precise unit-of-measure as the team commits to what it will take to get each story done.

Rahil Patel said…

Like many folks who have responded, I was a bit baffled at first after reading your post. 
In one of your responses, I’m glad that you wrote down SP = f(E, R, U) - it does bring things a lot closer to what I understand points to be.
You also mention in your post right above that complexity is part of effort, this also makes things a whole lot clearer.

So then would effort be a function of time and complexity rather than just one?
Also, how does one deal with temporary learning curves that go away after a while from a points standpoint?

One more question - do you think of Story points as a measure of “Doneness”, rather than just a measure of effort?  In other words, SP is a results oriented metric, because you don’t earn SPs till a story closes, ideally delivering business value with the closure.  A team could spend significant effort toiling away at a story, but points are not earned till all the acceptance criteria are met.  Thoughts?

Daniel K said…

Hi Mike

“One of the most compelling reasons to prefer story points even though it is another way of estimating effort is because it allows individuals of different skills to discuss the relative size of work. My favorite example is of two runners at the start of a trail–one says it will take 5 minutes to run, the other says 10 minutes. Both are right. Those are the amounts of time it will take each to run. There is no time-based argument they can have to settle on the “right number.” However, both can agree that some other trail is twice as long—one runner will be thinking “Yep, 20 minutes” and the other will be thinking “Yep, 10 minutes.” But both can agree “twice as long.””

I think this reasoning breaks badly if applied to your brain surgery/stamp licking example. The kid can presumably lick 1000 stamps in about an hour, while it will take him ~10 years to learn brain surgery. So for him the latter task takes ~10^5 times longer to complete. For the brain surgeon the tasks take approximately the same time to complete. Clearly these tasks are not comparable in the same way the running tasks are.

The “comparability” of software development tasks also varies (although seldom past either of these two extremes). Comparing the size of tasks involves assumptions, for example about who will do it. Often the most direct route to compare the size of two tasks is to make the necessary assumptions and reduce them both to expected time ± expected deviation. But in that case we have to do two absolute estimates to arrive at the relative one, so what does relative estimation buy us?

Mike Cohn said…

Daniel—
One of the points I made in a comment here somewhere was that the example breaks down here because it is of tasks rather than of something closer to a user story. A typical user story or product backlog item will involve work from 3 or more team members. The kid/surgeon example here is simplified to be of tasks. For a user story the impact you describe is mitigated by having 3+ people involved in the work. (That is, it’s unlikely that the 3 working on one story are all 10x better than the 3 working on a different one.) So individual performance differences wash out (some, not entirely).

Additionally, I’ve made the point above that we should assume that the right person will do the work. That doesn’t mean that the programmer who is 5% better than the other programmers needs to do all coding, but in the kid/surgeon case, I suspect that sometime during the kid’s 10 years of med school, the surgeon will find the necessary half hour to do the surgery, making the kid’s education all for naught ;)

Mike Cohn said…

Hi Rahil—
Yes, story points are an estimate of how long we think something will take. How long we think it will take is a function of the effort involved, how uncertain we are about our estimate of that effort, how much risk is there, etc. Some of those are affected by complexity. A highly complex task is likely to have more risk and likely to have a higher number of story points. However, the complexity itself does not add to the number of story points. It should only do so if that complexity affects or might affect the amount of time the work will take.

As for learning curves, the team should always estimate given their current state of knowledge, tools, etc.

And yes, points are earned only when a story is done. No partial credit is given.

Baartz said…

Nice article. Any thoughts on the relationship between backlog order and effort? It appears to be a recurring theme that the completion of backlog item A reduces the presumed effort in backlog item B. Since effort estimates are captured early, and prioritization can occur late, how does one account for this?

Mike Cohn said…

Hi Robz—
Nice summary of our discussion.

Mike Cohn said…

Hi Baartz—
I’ll have to create a separate blog posting on that. But it’s a good idea so I’ll see how soon I can get to it. I’m coming off a two-week holiday (en route home right now) so I’m quite backed up on things currently. I’ve added this topic to the list though.

Marek Blotny said…

Hi Mike, the concept of story points was always slightly confusing for me, but as I can see, not only for me! Inspired by your post, I have published a post on my blog about story points and what they are for me.

I have proposed the following mapping between SP and time:

1 SP = less than 4 hours
2 SP = 4 hours to 1 day
3 SP = 1 to 2 days
5 SP = 2 to 4 days
8 SP = 4 to 8 days

Of course each team can re-define this mapping for their own purpose but do you think that it can be used as a starting point?

Mike Cohn said…

Hi Marek—
If you do a mapping like this you completely defeat the purpose of story points, which is to allow individuals of different skills to talk about the work. If you are twice as good a programmer as I am, you and I will never come to an agreement using your scheme. You could do some piece of work in 4 hours but it would take me 8. You and I could agree, though, that this piece of work is half the size of some other piece of work. For more on mapping points to hours see my earlier blog posting at http://blog.mountaingoatsoftware.com/how-do-story-points-relate-to-hours
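A tiny sketch of that argument, using the 4-hour and 8-hour figures above; the second story and its hours are assumed for illustration. The two programmers never agree on hours, but they do agree on the ratio.

```python
# Hours each programmer would personally need (story B's figures are assumed).
hours = {
    "faster programmer": {"story A": 4, "story B": 8},
    "slower programmer": {"story A": 8, "story B": 16},
}

def relative_size(person, story, reference="story A"):
    """Size of a story expressed as a multiple of the reference story."""
    return hours[person][story] / hours[person][reference]

for person in hours:
    print(person, hours[person]["story B"], relative_size(person, "story B"))
# faster programmer 8 2.0
# slower programmer 16 2.0
```

An hours-per-point table forces the two to argue over 8 versus 16; a relative scale lets both say "twice story A" and move on.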

Marek Blotny said…

Hi Mike, thanks for your reply. I see your point but I still have my doubts:

1) Everything depends slightly on the definition of ‘done’ for user stories. Usually coding is just a part of all the activities required to finish a story. Therefore I don’t think such big differences between individuals are very likely.

2) My idea is to find a reference story and agree within the team that the given story is worth, let’s say, 2 points. And from that point compare (in terms of size) other stories to the reference one. Of course one may say that “individuals of different skills” are unable to agree in terms of time. It is important to understand that we should talk about the team rather than individuals. It’s hard to assume that story X will be implemented by individual Y. Estimates should take into account average skill within the team. Therefore I think that there are 2 options to agree while estimating: simply take the highest estimate or ask these individuals to estimate how much time they need to finish the given story working *together* on it.

Mike Cohn said…

Hi Marek—
You’ll see in the dozens of comments from me on this post that I agree with your two points: the things we estimate in story points are product backlog items and these are typically done by multiple people so that individual performance differences (which are often much larger than 2-1) are reduced somewhat.  Second, I agree with finding a “reference user story” and calling it a 2 and then estimating relatively from there. That can’t happen though if individuals are each mapping points to hours such as 1 point = 4 hours. Instead two points is whatever that one story is and then 4 points is something we each think will take twice as long.

Dan Bergh Johnsson said…

Hi Mike

I like the example of “licking thousand stamps” as it really clarifies the distinction between complexity and effort - and it is no doubt the latter that is useful to estimate and track, as it enables prediction; the former is not useful in the same way.

However, the “licking thousand stamps” is also a good example to show how complexity and effort are related. And how, in software development, the former drives the latter.

When licking ten, hundred or thousand stamps, it is the bulk of the work that drives the effort. But if you are to lick a million stamps, then you will probably build a stamp-licking machine. And then it is the complexity that is the major driver of effort.

In my experience, software development efforts are very often about handling complexity - essential or accidental. Thus it is more akin to the machine-building than to the manual stamp-licking. Of course, complexity that does not add to our effort is uninteresting (as you have pointed out).

So, IMHE we want to estimate effort, but complexity is the major driver behind that effort.

I elaborated this a little at http://bit.ly/cCQLzr.

How does this compare to your experience? What other drivers of effort have you seen in the wild?

Mike Cohn said…

Hi Dan—
Good points. And yes, I completely agree that complexity is a major driver of effort in our industry. In one of the many comments here I listed others such as uncertainty, risk, and sheer volume of the work (1000 vs 10 stamps).

JOOLz said…

Hi Mike

I think there is a common problem here with story points which is that in my experience there are always some who want to define exactly what a story point actually is. In doing so they try to equate it to some other measurement that they are familiar with such as time.  This is understandable.

But we mustn’t forget how story points assist the team by remaining an abstract measurement of relative size.  It alleviates the necessity for long discussions about how the story is to be delivered, how complex it is and of how much time it will take to implement i.e. too much detailed discussion.  Discussions to this level of detail generally result in differing views within the team about the amount of effort required and therefore a protracted, labour intensive and perhaps tedious planning meeting.

Keeping story points as an abstract but relative measure of effort means that teams get to estimate a product’s backlog more quickly.  As soon as you have your first story estimated it becomes easier for the team to estimate the subsequent stories.  In my experience, even if the estimate for a story differs wildly between a tester and a developer, following a brief discussion as to why they differ so greatly, the team will eventually come to a consensus of how much effort is required.  This resulting from a discussion where the amount of time it will take is never mentioned.

Also, after a number of sprints it’s my experience that the team’s ability to estimate the size of stories settles down i.e. the individual estimates from each team member start hitting the same values from the first reveal of the planning poker cards for a given story.

I guess what I’m getting at is that experience speaks volumes.  And experience shows that story points do work even though it might appear they can’t or shouldn’t.  I would encourage anyone sceptical about using story points to try it first and give it a fair go too.  Tried over 4 or 5 sprints, I’m sure they will begin to see how the team will come to value it, seeing how it has improved their ability to estimate and plan a product delivery more quickly and effectively and with more accuracy.

Mike Cohn said…

Thanks for your comments, JOOLz. My experience matches yours. I openly acknowledge how uncomfortable story points can seem before trying them. Once used to the idea, they work well for most teams, though.

John Esser said…

Story points, long understood in the Agile community, are a measure of SIZE, not effort, nor complexity, nor duration. More specifically, story points are an expression of relative size. One major problem is that effort is generally thought of in terms of time units (e.g. man-days), so this causes a lot of confusion. Teams need to be broken out of this way of thinking to use story points effectively. I use the analogy of a house: a 1000 sq. ft. home is just that. How long does it take to build it? It depends on who is doing the work, their experience level, and other factors. In software we look at things like how much code we’ll have to write, how many pages we’ll have to change, how many tests we need to create, and how much code we’ll need to refactor. The key is having stable reference stories from which to estimate.

In the house example, if I have junior carpenters the elapsed time may be 6 months, but if I have master framers maybe it takes 4. Because I have used size (rather than effort, complexity) I can see the effect master workers had on the elapsed time.

The advice of Jeff Sutherland is very salient on the issue of story points. To quote: “relating hours to story points causes a huge impediment for the team. If the team is generating continuous process improvement, the hours per story point will be continually decreasing. Assuming a number of hours per story point makes it impossible for the team to show they have improved and fixes in their mind that there is a reasonable number of hours per story point.”

Also from Jeff, “My venture companies typically have two stable reference stories for 1,2,3,5,8,13,20 points. It is easy to see where stories fit. The Product Owner can help keep people honest on estimation…The problem for our teams is not inflating estimates. As they go faster they estimate similar stories to have fewer points unless they are very careful to have stable reference stories. For some of them we know they are going twice as fast and velocity has not changed. This is an impediment because they cannot tell when a process improvement has helped.”

Oaz said…

The example “lick 1000 stamps” is misleading because you never lick 1000 stamps in software development. You often lick 1 stamp. Occasionally you could lick a couple of stamps. After that you implement a StampLicker that does the job for all the remaining stamps with zero effort for the members of the team.

Though, the complexity of the StampLicker implementation has to be considered.

John Clifford said…

Hello Mike,

I have to question the premise of this article, because equating effort to time is, in my opinion, a huge mistake. They are related, but not equivalent, and equating the two violates a basic rule of estimation: estimate size, derive duration. What ties effort and duration together is velocity, or the rate at which the effort is exerted to implement the desired functionality. Yes, managers want to know duration rather than effort, but we are not helping them or ourselves by confusing or equating the two.

Instead of equating effort to time, it should be considered as analogous to distance. For example, Mike, and Tobias, and I, and everyone else reading this thread can generally agree on what a mile is, and if one distance is twice as far as another, or half as far. However, we will disagree on how long it takes us to cover that mile because that is dependent on our individual abilities. I’ve met Tobias, and I think he could run a mile faster than I currently can (wait until after arthroscopy!). Similarly, a world-class miler would accomplish the same amount of work more quickly than either of us. What’s the difference? Velocity.

Looking at the licking-stamps-versus-brain-surgery example, I would disagree that they should have the same measurement in story points. Brain surgery requires far more skill, and I think both a brain surgeon and a paper boy would agree that the effort required to lick a thousand stamps is far less than the effort required for even simple brain surgery… especially considering the differing Acceptance Criteria. That the brain surgeon can perform the surgery in the same amount of time as the teenager can lick and stick 1,000 stamps is due to the fact that his velocity for brain surgery far exceeds the teenager’s velocity for brain surgery even if their velocity for stamp licking is similar. Similarly, two different Scrum teams may agree on the same story point estimates for two different stories, but one team may take far longer to implement those stories due to their lower velocity (ability to turn ideas into potentially shippable increments of software functionality). Of course, how people do what they do also affects velocity. I can write the same functionality in assembler as I can in C#, but my velocity will be much higher in the high-level language.

Accordingly, I don’t really care about duration when I’m estimating because time-based estimates are closely tied to the person doing the estimate; I may answer “10 minutes” when asked how long it would take me to cover a mile while Carl Lewis may reply “5 minutes” and we are both right (if we have to run that mile in the time given) and both wrong (if we are estimating for the other person).

In short, story points are best used as a relative measure of effort/complexity/size instead of duration, and then a valid duration can be derived by factoring in the velocity of the people who will actually do the work.

tarandeep said…

Mike, I read your Agile Estimating and Planning book, and it says story points estimate relative size and then duration is derived from it by using velocity. I am confused when I read this post of yours! Is this not contradictory to finding relative size and deriving duration? Is this not the same as traditional estimating, where we derive duration directly?

Mike Cohn said…

Hi Taran—
I don’t find this post contradictory to my Agile Estimating and Planning book at all. Consider this quote from Chapter 4 of that book: “There is no set formula for defining the size of a story. Rather, a story-point estimate is an amalgamation of the amount of effort involved in developing the feature, the complexity of developing it, the risk inherent in it, and so on.”

I wrote this post because too many people were misunderstanding relative estimating in story points. They were interpreting it to be that we estimate solely the complexity of a user story. That’s wrong. Complexity is one factor and often perhaps a key one in software development. The point here is that we are still estimating how long something will take: doing this will take twice as long as doing that. But we are doing that in relative rather than absolute units.

Mike Cohn said…

John—
I believe we are saying the same thing. See all my replies in which I comment that what we are estimating is effort which is a function of how long something will take as adjusted by all sorts of things such as complexity, uncertainty, risk, etc. So I am not equating the two.

The point of this post was that story points are not “complexity points” as I had been hearing people rename them. Measuring nothing but complexity does not help us answer the questions asked by clients, customers, or bosses.

Simon Cockayne said…

Hi Mike et al,

What a super thread!

Ok, I am clear on the point of your post that “story points are not complexity points”.

I think what is slightly confusing though, is that on one hand you say “a story-point estimate is an amalgamation of the amount of effort involved in developing the feature, the complexity of developing it, the risk inherent in it, and so on.”, but in at least one place in the book you say, I think, a story point estimate is, paraphrased “a measure of pure size”.

Do you feel those two statements are in conflict or are you saying that Size = f(Effort, Complexity, Risk,...)?

Moreover, do you equate Effort to Duration?

Cheers,

Simon

Mike Cohn said…

Hi Simon—
I’m glad you get that story points are not “complexity points.”

I am saying Story Points = f(Effort, Complexity, Risk, etc.)

Effort is not equal to duration. I stick with the standard PMBOK definitions of those rather than try to redefine things.

Jose said…

Imagine that we have a team of painters and their job is to paint walls. The team see pictures of the walls and they estimate the wall area to be painted. There are small walls, medium walls and some really big walls. The team estimate each area using relative points. They start painting the walls and after some time, they are able to calculate how many points they can do on each sprint, the velocity. Now the customer can calculate how long the project will take, using the velocity and the product backlog.

We are able to predict the duration of the project without using any “time” in our estimation, because the duration is based on the amount of work done per sprint. No need to use time in our estimation.

I always thought this was how story point estimation was done.

Mike Cohn said…

Hi Jose—
This is how story point estimation is done. Your example is a great one. It’s one I’ve been using in classes for a couple of years now. But let’s think about what your hypothetical team thought about when creating the estimates.

Let’s keep it simple (at first): no furniture in the room, no crown molding, etc. In your example the team had some really big walls and some small walls. They sure didn’t estimate based solely on complexity. If they did, both walls would have the same number.

Now let’s add some windows and window sills on the small wall. Add enough that our hypothetical team looks at a big wall and this small-wall-with-windows and wants to put the same number of story points on the two walls. If story points were merely “complexity points” they would have put a higher number on the small wall (since it’s more involved to paint).

What’s going on here is that the team is estimating in the only thing that matters: how long the work will take as adjusted by factors like uncertainty, complexity, etc. This is why the large wall and the small wall with windows get the same value. And intuitively that makes sense because these walls will take the same amount of time to paint.

Some at this point will say “Then why use points at all?” I’ll share just a couple of reasons here because that topic deserves its own whole post someday (and it’s also covered in the Agile Estimating and Planning book):


- Suppose the painting will be done by a professional house painter and his 8-year-old kid. They can’t have a discussion directly around how long it will take. Dad says “This room will take me 4 hours” and the kid says, “No way, more like 4 days.” And both are right. They can discuss the room relatively: “Hey junior: This room is smaller but will require more taping around windows so I think it will take just as long as the big, easy room we looked at earlier.” So points let people with different skill sets discuss the estimate meaningfully.


- Suppose that you’ve been making these wall painting estimates from photos I’m showing you of the walls. There’s nothing else in each photo but the wall. You come up with all your story point estimates and then I spring on you that I’ve been showing you pictures of Bill Gates’ house. The wall you thought was 15’ by 10’ is really 450’ x 300’. If that were true of all the photos I showed you, your point estimates would still be valid because all the estimates were relative. Your velocity would be much lower than you’d anticipated, of course, but the estimates would still be valid. If those estimates had been done in a non-relative manner such as perhaps with ideal time, you would have to redo the estimates.
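
The second point can be illustrated with a small sketch. It assumes, purely for illustration, that wall area stands in for effort and that the wall names and the two extra walls are hypothetical; only the 15’ by 10’ versus 450’ x 300’ figures come from the example above.

```python
# Hypothetical wall dimensions in feet, as guessed from the photos.
guessed_walls = {"wall_a": (15, 10), "wall_b": (30, 10), "wall_c": (15, 20)}

# 15' x 10' turns out to be 450' x 300', so every side is 30 times longer.
SCALE = 30


def relative_points(walls, reference):
    """Size every wall relative to a reference wall (the reference is 1 point)."""
    ref_width, ref_height = walls[reference]
    ref_area = ref_width * ref_height
    return {name: (w * h) / ref_area for name, (w, h) in walls.items()}


actual_walls = {name: (w * SCALE, h * SCALE) for name, (w, h) in guessed_walls.items()}

points_before = relative_points(guessed_walls, "wall_a")
points_after = relative_points(actual_walls, "wall_a")

# An absolute estimate would be off by the ratio of the real area to the guessed area.
area_ratio = (450 * 300) / (15 * 10)

print(points_before == points_after)  # True: the relative estimates survive
print(area_ratio)                     # 900.0: absolute estimates would need redoing
```

The relative sizes are unchanged, so only velocity (points finished per sprint) has to be revised, which is the claim made in the bullet.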

So, Jose, your example was a great one and your understanding of story points is perfect. What’s happening though is that your mythical painting team is thinking about how long each wall will take when they estimate it.

Jose said…

Hi Mike, thanks!

These two examples are wonderful!! I can really understand what you are saying about using time in estimations and the examples show how to think in relative manner.

But I should insist that maybe we can estimate without using time and achieve the same results for the customers.

Let’s go back to the painting team example. You said that the painting team were thinking about how long each wall will take when they estimated it. But that’s not really true. They were told not to think about how *long* it would take to paint the wall, but to think only about the wall area: they must *measure the area* in relative points, the size of the wall. Well, after some sprints they know how many points they can do in each sprint, so now they are able to predict the duration of the project. The team wouldn’t have to think about how long it takes to paint a wall until they actually start doing it.

Another example. Forget about story points, let’s measure the wall in square meters. Imagine that the team has measured the *area* (and only the area) of all the walls in the backlog. They start painting the walls and after 2 weeks they find they have done 3 walls: one with 100 square meters, the second with 50 m2, and the third one also with 50 m2. So they now know that they can paint 200 square meters per sprint. Suppose that the backlog has 2000 m2, and so I can predict the duration of the entire project: 10 sprints.
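
The arithmetic in this square-meter example can be sketched in a few lines. The variable names are made up for illustration, and the sketch assumes velocity stays constant from sprint to sprint.

```python
import math

# Walls painted in the first two-week sprint, in square meters.
completed_walls_m2 = [100, 50, 50]

# Total area on the backlog, in square meters.
backlog_m2 = 2000

# Velocity is simply the amount of size completed in one sprint.
velocity_m2_per_sprint = sum(completed_walls_m2)

# Duration is derived from size and velocity; round up to whole sprints.
sprints_needed = math.ceil(backlog_m2 / velocity_m2_per_sprint)

print(velocity_m2_per_sprint)  # 200
print(sprints_needed)          # 10
```

No time-based estimate appears anywhere in the inputs; time enters only through the length of the sprint in which velocity was observed.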

My whole point is that we can estimate attributes of things that are not time or time-related, and even so we are able to predict duration using these estimates, if the attributes are directly proportional to time. Wouldn’t that be true?

I think that wall area is directly proportional to time to paint, so I can estimate areas and derive duration using area painted per sprint. Would that be true for software complexity? Is software complexity directly proportional to time?

Mike Cohn said…

Hi Jose—
When I teach my Agile Estimating and Planning class, one slide comes up that I stress is one of the most important in the class. It says “Estimate size, derive duration.” It’s about doing these in separate steps. We estimate size by saying “this work is such-and-such big.” We derive duration by dividing size by velocity. We can of course get fancier—working with ranges, confidence intervals etc. But that’s the general idea.

When you suggest measuring the square meters of the wall, you are using that as a measure of size. So in your approach you would be “estimating size, deriving duration,” which is exactly my goal.

This works because you picked a metric that will closely correspond with the duration of the work. But I’ll contend it could be quite wrong because some walls are dark colors now but we want to paint them white. Other walls have fancy molding on them or doorways or windows cut in them. Other walls held our taxidermy collection and we need to patch those big holes before painting. So your metric is one that would have a high correlation with total time spent painting but it wouldn’t be hard to factor in other things and come up with a better estimate. In factoring in those other things (presence of molding, paint color, holes to patch, etc.) the common element is the effort involved. I can’t add “50 square meters plus 3 big holes + crown molding.” The only common denominator for them is effort: “Hmm, this 50 square meter wall with its 3 big holes and crown molding is probably about the same effort to paint as that 100 square meter wall with no extra factors.”

Alternatively, we could create a parametric estimation model by analyzing a lot of data and coming up with a formula such as total paint time = square_meters * 1.5 + holes * 8 + linear_meters_of_crown_molding^2 * 7.2. COCOMO is the most famous parametric estimation model but there are others. These have never really caught on in practice though.
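
As a rough sketch, the illustrative paint-time formula above could be written as a function. The coefficients are the made-up ones from the comment, the time unit is unspecified there, and reading the exponent as applying to the crown molding length is an assumption.

```python
def total_paint_time(square_meters, holes, crown_molding_m):
    """Illustrative parametric estimate; coefficients are not calibrated to real data."""
    return (
        square_meters * 1.5          # area term
        + holes * 8                  # patching term
        + crown_molding_m ** 2 * 7.2 # molding term (exponent placement assumed)
    )


# A plain 100 square meter wall with no extra factors.
plain_wall = total_paint_time(square_meters=100, holes=0, crown_molding_m=0)

# A 50 square meter wall with 3 big holes and a hypothetical 2 meters of molding.
fancy_wall = total_paint_time(square_meters=50, holes=3, crown_molding_m=2)

print(plain_wall)
print(fancy_wall)
```

A model like this converts different kinds of work into one common unit, which is what a team does informally when it judges two walls to be about the same effort. The cost is that the coefficients have to be fitted and recalibrated locally.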

Jose said…

Hello Mike!

I understand what you’re saying, my example was really too simple. But imagine that I could find such a function with parameters for every wall characteristic that affects estimating (probably there aren’t too many parameters for walls; at least I could create/adapt a function for each painting project). Then I could use it to derive durations, without using time in estimations, right? I’m not advocating doing this in software, just making clear that it’s possible to do estimations and derive duration without using time in estimations. But in software this is much harder to do. How to find such a function? COCOMO could be a way but it looks too costly and has many constraints. It’s better to stick with expert judgment.

But, to my surprise, Boris Gloger posted an entry on his blog yesterday on the same subject (http://borisgloger.com/en/2010/10/12/agile-estimation-basics/), but with a thesis opposed to this post. He says that he “never estimate efforts, because that does not work”. To estimate a story one should “simply try to define the ‘dimensions’ of this story, which have nothing to do with the implementation of this story”.

Actually I’m a bit confused now. You two are very respected gurus in the Scrum and Agile community, but it looks like you have opposite opinions on this very important and difficult subject.

By the way, all the comments in your Blog are so good that you could create an entire book just with them. Thanks a lot for all your time and attention to us.

Mike Cohn said…

Hi Jose—
I have not read that other blog so I cannot comment on it. Yes, you could create a parametric estimation model but you say yourself you don’t want to because of the effort involved. That and the need for local calibration have been the challenges of otherwise great approaches like COCOMO.

Jose said…

Hi Mike,

I thought you could comment on another blog, not aiming to criticize, but to bring more wealth to the debate, more knowledge for the community.

It would be very interesting and helpful to us if you could give us your opinion about those ideas.

Freyr said…

If I understand the conversations above correctly, I see two sides here:

1. Improvements in the team are reflected in the fact that individual story estimates are gradually reduced.  So a story last year which was a 5, now has some process improvements and this year is a 2.  Velocity stays steady, story points per story decrease.

2. Improvements in the team are reflected in velocity.  So the story of 5 since last year is still a 5, but now we can do more such stories in one sprint.  Velocity increases, story points per story stay steady.


In #1, the only way for me to see improvement trends is to monitor stories which are comparable.  I can see that we are better now than before by comparing two similar stories done at two different times.  Velocity tells me nothing.  Additionally, after every sprint, stories which have already been estimated are potentially immediately out of date due to process improvements.
In #2, I can simply monitor velocity.

I like thinking in terms of physical distance.  I draw an analogy between the following two questions:

1. What is the story point value of this story?
2. What is the distance in meters from point a to point b?

If I were to ask 5 different people “How long will it take you to get from point a to point b?” I would get 5 different responses, a lot of discussion, and no valuable result.  The value I see is in the relation between distances.  If the 5 people all agree that the distance from point a to b is “1”, then when I ask them to estimate the distance between point c and d, they can do so pretty accurately, even if their premises are completely different.  This distance never changes; it is a constant.  So even if the premise of the estimate changes (e.g. you can now use a car to cover that distance) the actual distance is the same, and you measure progress by realizing that you can cover more of this distance in the same time than before your process improvement(car).

If I can cover 3 times the distance from a to b in one sprint by walking, then my velocity is 3.  If my process improves by introducing a car, then I can cover 30 times the distance between a and b, shouldn’t my velocity be 30?  Why would I take the individual story (get from a to b) and reduce its value from 1 to 0.1?
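
The two bookkeeping options in this distance analogy can be compared directly. This is only a sketch: the constant and function names are hypothetical, and `Fraction` is used just to keep the 0.1 arithmetic exact.

```python
from fractions import Fraction

TRIPS_PER_SPRINT_WALKING = 3
TRIPS_PER_SPRINT_BY_CAR = 30


def velocity(trips_per_sprint, points_per_trip):
    """Velocity is the number of points completed per sprint."""
    return trips_per_sprint * points_per_trip


# Before the improvement: each a-to-b trip is worth 1 point.
baseline = velocity(TRIPS_PER_SPRINT_WALKING, Fraction(1))

# Option 1: re-estimate the trip downward after the improvement (1 becomes 0.1).
velocity_reestimated = velocity(TRIPS_PER_SPRINT_BY_CAR, Fraction(1, 10))

# Option 2: keep the trip at 1 point and let velocity absorb the improvement.
velocity_stable = velocity(TRIPS_PER_SPRINT_BY_CAR, Fraction(1))

print(baseline, velocity_reestimated, velocity_stable)  # 3 3 30
```

With re-estimation the improvement disappears from velocity and shows up only in the shrinking estimates. With stable points the same improvement shows up as a tenfold velocity increase.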

Nirav Assar said…

Mike, the concept of this post does not make sense. If story points are really about effort and effort is time, then what is the point of having some meaningless conversion? Why don’t you just use time as your estimate? When we say velocity in physics, it is d/t (distance over time). Story points are the equivalent of distance in software terms, and a story point is complexity. Velocity in agile is basically how much complexity you can tackle per sprint (iteration). If story points are just effort, 3 programmers on a regular work week can only cover 120 hrs of work. Your velocity never increases.

Thus your point makes no sense. I think you are way off on this one.

Mike Cohn said…

Hi Nirav-
I’m sorry the post made no sense to you. I stand by its points though (pun not intended there). This post was clarifying how people misuse points and think of them solely in complexity, which is a wrong use. I didn’t attempt in this post to cover the advantages of points. But that did come up in at least one of my many replies to the comments on the blog. For example, see my reply on October 10, 2010 at 12:54 pm above.

As for your point about velocity in physics: You are absolutely right. I hate that we call this velocity but that is a legacy term from the original XP teams. I prefer to call it “pace” and when I teach these concepts in classes I do mix that term in along with the standard “velocity.” But, I don’t like fighting lexicons so I use the fairly standard terms such as velocity instead of pace and “product backlog” instead of what my earliest Scrum teams called the “prioritized features list (PFL)”, which I also think is a better term.

Veselin said…

Mike, I have one big problem with story points; I’m not sure you are aware of it. I see many teams new to agile estimating every single story, and I mean hundreds of them (in one team close to 1000), prior to the project start. The reason is that teams are asked how long it is going to take them to finish the project. So the obvious answer is: let’s estimate every little thing in story points, we know the velocity and voila, magic happens.
What happens in reality: the project stalls for weeks, and the only thing I hear about is poker games and story points. Actually I even had one team complaining that they are slowed down because now they do agile stuff 😊
BTW, I do know what Craig Larman says about it (and the discussion about workshops and finding the right balance for when to start the project) but I want to hear your opinion about it too. I wonder whether other people have seen the same?
Thanks

Mike Cohn said…

Hi Veselin—
I don’t see how that is specifically a problem with story points. Those same teams—in their pre-agile days—would have spent the same multiple weeks doing a big task decomposition to try to appease some boss with a “perfect” estimate.

And, yes it is critical to find the right balance between anticipating and adapting to changes. I have a post specifically on that topic at http://blog.mountaingoatsoftware.com/balancing-anticipation-and-adaptation

Arran Hartgroves said…

I find the complexity story pointing approach commonly, but I believe teams often sub-consciously relate this to effort when assigning relative values so hopefully this misconception hasn’t caused too many estimation issues. I feel sorry for the guy that did brain surgery 1 sprint (based on complexity), to then take on licking of stamps the next sprint! Great analogy by the way.

I suspect the complexity approach is widespread, has this been passed on through some common syllabus or book, or is this just a common misconception?

Angry Ashley said…

Personally, I think this article is just plain wrong. There is a difference between complexity and a different outcome in mind. In my view complexity should only ever be used to give an indication of duration. The velocity planning is designed to give a measure of duration when team size, particular BA/Dev/QA skills are all taken into account. This is different from complexity for a reason – the relative complexity of one story next to another will change less often. If you get two iterations into a release and think your velocity is slow so you add another dev, do you go through and re-estimate all the effort, or, if you have a relative complexity, do you just apply a shift in the velocity across all points? In most Agile environments the near-enough reapplication based on relative complexity and velocity is the right measure. If your project is time critical or particularly constrained then you may go to the extra effort of recalculating effort but rarely is effort actually estimated for anything but the next few stories. If you need effort to be calculated for more than this then you are losing some of the benefit of being “just-in-time”. Whether or not this is fine depends on the risk profile of the project.

Mike Cohn said…

Hi Angry Ashley—
I believe you have missed a couple of key points in what I’ve written, which is quite understandable as some came out only in the lengthy discussion that followed the initial post. You are correct that “velocity planning [I’d call this “release planning”] is designed to give a measure of duration when team size [is] taken into account.” Yes—that’s duration. What this post is about is that story points are an estimate of *effort* and that duration is derived by dividing effort by velocity. Since velocity will change (as you allude) as team size or skills change that will change the duration produced by that division. But the size of the team does not change the effort involved. Effort is independent of team size, skill, etc. So your comment about “you may go to the extra effort of recalculating effort….” is wrong. Effort will be the same. This would be a recalculation of duration.

Enric L. said…

Mike, based on what you say in this blog, does the sequence of estimating user stories affect how those are estimated?
For example: If I have 10 user stories that are exactly the same in nature, let’s say 2 story points each, but as soon as I finish the first one, the second one will become much easier and faster to complete. Should we then assign 1 story point to the second one, third one, and so on? Or should all remain 2 story points regardless?

Mike Cohn said…

Hi Enric—
Here’s the way to reason through to an answer on a whole class of questions just like the one you pose: The sum of the estimates on the product backlog items should always represent the true overall size (as estimated) of the product. With that in mind, if we estimated all the items you describe as 2 points each we would overstate the size of the product to be built. We’d call it 20 points when the real size is presumably 2 + 9x1 = 11. So, the right answer is to put the lower number on 9 of the 10 stories, put 2 points on one of the stories. And if it’s somewhat undetermined which of the 10 will be done first consider putting notes on the stories like “Assumes this is done after story x.”
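
The difference between the two ways of estimating these ten stories is a one-line sum. The list names below are made up for illustration.

```python
# Every story estimated as if it were the first one done.
naive_estimates = [2] * 10

# One story carries the first-time cost; the other nine are cheaper afterwards.
sequenced_estimates = [2] + [1] * 9

naive_total = sum(naive_estimates)
sequenced_total = sum(sequenced_estimates)

print(naive_total)      # 20
print(sequenced_total)  # 11
```

The naive total overstates the backlog by 9 points, which would in turn inflate any duration derived from it.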

Freyr said…

Ok, so process improvements (or refactorings, etc) can reduce the effort of a subset of subsequent stories.  What does that leave for reasons for potential velocity increase?  Excluding the process improvements which would affect every single type of story (where it would be easier to just leave the stories as they are and watch velocity grow), does that only leave things which are directly related to personnel?  Like adding more team members, training them, improving teamwork and communications skills, etc.?

I wonder about this because I do not want to re-estimate my entire future backlog every time we make a small process improvement (which for our teams is at least once every two weeks).  I wonder if it makes sense to catalog each improvement and attempt to identify if it directly affects a certain type of story, to get an overview of which stories would (potentially) need to be re-estimated.

Mike Cohn said…

Hi Freyr-

I don’t see why you would need to re-estimate your entire product backlog. I suspect you are trying to do things with velocity that are best not done. Although in concept you should be able to look at a team last year and this year and say something like “this team is 14.2% faster” that is unlikely to be the case. For that to be true a team would need to estimate this year’s user stories relative to last year’s user stories but do so with last year’s capabilities in mind. It’s hard enough estimating with our current capabilities in mind. An example:

Story A is done last year and was estimated at 5 points.
If Story B had been estimated last year at the same time everyone would have called it 5 points as well.
But Story B isn’t estimated last year, it’s estimated this year.
And the team has improved—more training, better at testing, read a few good Java books, etc. They estimate Story B and call it 4. They justify that estimate because “we can do Story B right now a little faster than we did old Story A.”

The team then starts a sprint into which they bring Story B (4 points) and let’s say Story C (1 point). They have a velocity today of 5 points. But in some sense we can say they should have a velocity of 6 points and that they are 20% faster.
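
The numbers in this example can be laid out explicitly. The field names are hypothetical, and the sketch assumes Story C would have been 1 point on last year’s scale as well, which is what the 6-point figure implies.

```python
stories_in_sprint = [
    # Story B was called 4 this year but would have been 5 last year.
    {"name": "B", "points_now": 4, "points_last_year_scale": 5},
    # Story C is assumed to be 1 point on either scale.
    {"name": "C", "points_now": 1, "points_last_year_scale": 1},
]

# Velocity as the team actually reports it, using this year's estimates.
reported_velocity = sum(s["points_now"] for s in stories_in_sprint)

# Velocity if the same work were sized with last year's capabilities in mind.
rescaled_velocity = sum(s["points_last_year_scale"] for s in stories_in_sprint)

improvement = (rescaled_velocity - reported_velocity) / reported_velocity

print(reported_velocity)     # 5
print(rescaled_velocity)     # 6
print(f"{improvement:.0%}")  # 20%
```

The improvement is invisible in the reported velocity because it has already been absorbed into the lower estimate for Story B.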

Your argument will likely now be that since your team improves every sprint (congratulations) your team should re-estimate Story B if it had indeed been estimated a year ago. Not to do so would, for your team, overstate velocity. However, it should be pretty rare that a team is doing a year-old story. If they are, the story should have been an epic that was broken into smaller pieces and re-estimated. Further, the potential problem here is easily enough fixed by saying “We will re-estimate any story over a year old.” So you wouldn’t be re-estimating your entire product backlog, only ancient items that got left undone for a long time, of which there should be very, very few.

But to be clear, I don’t recommend doing any of this because I’m not wild about the idea of using velocity as a way of measuring team productivity improvements.

Freyr said…

Thanks for the reply.

We try to keep a forward-looking cone, so that e.g. the next 4 sprints are estimated and broken down to enough detail to pull, the next 4 slightly less detailed, etc.  What tends to happen is that stories get rearranged (partly due to other problems, e.g. not planning for iterations, but also due to legitimate reasons), which might mean that some stories might “circle” around sprint 3-5 for a while before finally reaching sprint 1.  This means that the estimates for these stories might be 3-6 months out of date.  When they do come closer (sprint 1 or 2) of course we re-estimate them and often rewrite them, but while they are circling, they have not yet reached our focus point of re-estimation.  Which means that they represent an increasing risk to planning.  I could add an “age” factor to each story, and “force” a re-estimation (or re-evaluation) every x months, but I am always wary of introducing more complexity to our backlog.

How would you deal with the risk factor such stories impose on project planning in light of constant improvement of teams?

My initial concern was indeed directly related to using velocity as a way of seeing (not necessarily measuring) that a team is improving.  How would you prefer to measure/show team improvements? 

I realize that my questions break the scope of this post, but I would be much obliged if you could point me in the right direction.

Thanks,
Freyr.

Mike Cohn said…

Hi Freyr—
It’s hard to propose a solution to your re-estimating problem without being there to delve much more into its specifics. In general I don’t re-estimate a lot. I want the product backlog items to all contain about the same amount of uncertainty, otherwise it messes with velocity and predictions you can make from it. I’ve written about re-estimating elsewhere on this site.

As for measuring productivity, the best approach has been that done by Michael Mah using Larry Putnam’s productivity index. If you search for Mah you’ll find some papers by him. There are also references to him on the http://www.SucceedingWithAgile.com site in the presentation in the Resources section.

Freyr said…

Thanks for the ultra-swift response!  I wasn’t expecting a reply to such an old post for a while.

I appreciate the nudge back on track.  Thanks.

Jake Gordon said…

I’m really late to the game but what a great discussion! I’m surprised it stirred up so much controversy. My takeaway is simply that when factoring complexity and uncertainty into your story point estimates one is ultimately just stating that a story is likely to take longer because it is complex or because there is probably additional work that is not yet understood. I have trouble imagining circumstances where complexity does not equate to time, at least indirectly.

This post also led me to another excellent post that answered some other questions that were haunting me: http://blog.mountaingoatsoftware.com/to-re-estimate-or-not-that-is-the-question

Thanks for the insight!

Mike Cohn said…

Hi Jake—
I’m glad you’ve enjoyed our discussion here. The only thing I’d add on complexity & time is that complexity influences time and may sometimes be the biggest driver of the time that a story will take to develop but we are not directly estimating complexity because other things are involved (e.g., sometimes the sheer amount of work to be done).

Vijay said…

One possibly last attempt at eliciting from you what all the others have tried so many times before, as I am still not convinced that one size fits all (no pun intended). Well, SP = f(E,C,R) is the best we could get so far in line with our thinking, and in fact, as you mentioned in one of the latest comments, E = f(C, ...), so it does not need separate mention. But why not have story points reach out to a larger audience by saying S.P. = f(V, C, R), where V is the Volume of work as perceived by various audiences: read function points, feature function points, screen elements and actions, report groups, or whatever is related to the Work Product and not the work itself. So, I would like to see SIZE as the work product / item SIZE and not the work SIZE. Of course we can adjust it to work size based on the technology and tools we use, usually a deferred decision based on lean principles and Murphy’s law.

Taking the example of the painting mentioned earlier: I would like the estimate for painting the wall to be initially the same when I do story estimation, irrespective of whether I paint it manually or use a robot or machine. Usually, whether the target is robot- or machine-friendly gets detected much later, and we can of course account for it when we split into task estimates in hours or in a pre-sprint story point adjustment.

Same holds good in IT examples as well where some features can be implemented using high-productivity tools and frameworks while others due to their nitty-gritty requirements may end up being built using more low-level technologies.

Do you recommend, or at least empathise with the need for, a two-stage estimation approach for this scenario? Also, I would like to reiterate what others have said: the first estimate based on volume can be well understood by one and all - tester, BA, developer, user, etc. - while the second level is usually a story of the elephant and the blind men, where complexity is seen purely from the perspective of the difficulty of test automation or code generation or design or integration etc.

Some concrete examples where we have a degrading level of productivity in chosen technology - just to get the message across well - Sharepoint direct feature vs. .Net web part, direct informatics transformations vs. custom transformation, jasper reports vs. custom JavaScript vs. Flex Reports.

ibes said…

Hi Mike,

First of all I would like to thank you for the time you have spent here clarifying everyone’s “situations”.

I would like to start with the example that you gave us at the beginning of the comments thread, the one about the surgery guy and the stamp guy. Let’s suppose that we are dealing here with one team, the surgery guy and the stamp guy, along with two stories: the brain surgery and the stamp licking.
When the team starts estimating, its members should agree on each story. What interests me is how the stamp guy can evaluate the surgery story. At the beginning, the surgery guy must certainly explain what is involved, and most probably the stamp guy still cannot estimate it. In this case, we could say that the team is not well formed and needs to change: the differences between the team members are too big or even incompatible (a very rare case anyway). The stamp guy will give the brain surgery a big SP number, because not everyone on the team has the same experience. What would be the win-win in this scenario?

But in other cases, we may have to deal with similar people ... and they must agree on the differences between brain surgery and arms surgery, for example.
Just to conclude, all these were presented because, at some point, any member of the team can do the work, and that’s why they must understand and participate from the beginning in this process.

Mike Cohn said…

Hi Ibes—
You’re welcome. I’m glad you find the responses helpful. The discussions in these posts are the most useful part.

You are correct that a kid capable of licking postage stamps cannot relate to how long it will take to perform brain surgery. (For that matter, neither can I!) First, let me reiterate that I am writing in this post (and these comments) about estimating product backlog items, not tasks. The difference between a product backlog item (often a “user story”) and a task is that the story will generally involve 3-4 people, e.g., a programmer, a tester, a designer, a DBA, or an analyst. A task belongs to one person. So a team that includes the surgeon is of course more likely to include others who will touch that story (nurse, anesthesiologist, etc.). They will be able to comment intelligently on the item. This reduces the occurrence of the problem you’re describing but doesn’t entirely eliminate it. For example, we could include on that “team” the janitor who will clean the operating room after the surgery. That person won’t likely know how long the surgery will take.

Because not everyone can intelligently estimate every product backlog item, in our Planning Poker decks we include a ? card. This card is what the janitor or stamp-licking kid would hold up when the rest of the team is estimating a surgery.

I would challenge a couple of your assumptions about team heterogeneity. Nowhere have I said that “at some point, any member of the team can do the work.” That’s very false on most teams and especially software teams. We may have people who can do more than one person’s job but it’s rare that all can do all jobs. My view is that we should assume the right person for the job will do the job. That is, when it comes time to schedule the surgery, we will do it at a time the surgeon is available to perform it. On less silly, more realistic, examples, we sometimes find people working outside their specialty (a programmer writing tests, perhaps) but that is done because it improves overall team throughput, which is a greater goal than maximizing individual productivity.

Andy Hu said…

Hi Mike,
 
  Thank you for your post; it is really helpful.

  I agree that story points are not purely the complexity of a thing but the effort it takes the team to do it. But I do worry that when we estimate in time, the estimate becomes tied to individuals’ capability and experience. It leads team members to estimate based on themselves rather than on the whole team.

  When we estimate using story points, which are an abstraction of “the effort taken by the team to complete one thing,” it helps the team estimate as a team rather than as individuals.

Mike Cohn said…

Hi Andy—You’re absolutely right. By using the abstraction of points we allow people with different skillsets or proficiencies to discuss the effort involved in a user story (or product backlog item, more generally).

Emilia said…

Hi Mike,

Your post was very helpful, especially the examples.
I am struggling with the fact that the team often says they cannot estimate a story (they cannot evaluate its duration compared to others) because it doesn’t resemble the ones in their past.
I am talking about a component team that is implementing new features. I am not sure how to deal with this, especially when it’s a constant reason for not estimating. As a result, they normally split the story into tasks to help them provide an estimate.

Mike Cohn said…

Hi Emilia—
I’m glad you found the post helpful. A lot of times when teams express a reluctance to estimate for any reason it’s because they don’t feel safe with the process. Team members are used to being “beaten up” over their estimates by the bad managers of their past. Part of our job is to create safety around the estimating process so they’re comfortable with it. They need to know we won’t yell when something takes longer than expected.

Because you say your team is doing this constantly, I’d suspect that this is at least part of the underlying problem. As part of creating safety, I like to tell the team to estimate in “buckets” rather than precise numbers. This is part of what I coach teams to do with Planning Poker. With Planning Poker we might use numbers like 1, 2, 3, 5, 8, 13, and 20 (and perhaps larger for early, rough estimates). Each of those is a bucket. The 20 isn’t saying “this is exactly 20 points.” It’s saying “This item is in the 20 bucket; it’s bigger than 13 and up to 20.” Thinking this way is one way for a team to feel safer with the estimating process.
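As a rough illustration of the bucket idea, here is a minimal Python sketch. It uses the card values quoted above; the function name `to_bucket` and the choice to reject estimates larger than the biggest card are assumptions made for this example, not part of Planning Poker itself.

```python
from bisect import bisect_left

# Card values quoted in the comment above; a team may use a different deck.
BUCKETS = [1, 2, 3, 5, 8, 13, 20]


def to_bucket(raw_estimate: float) -> int:
    """Return the smallest bucket that is large enough to hold the estimate."""
    if raw_estimate <= 0:
        raise ValueError("estimate must be positive")
    index = bisect_left(BUCKETS, raw_estimate)
    if index == len(BUCKETS):
        # Hypothetical policy for this sketch: too big for the deck.
        raise ValueError("larger than the biggest bucket; consider splitting the item")
    return BUCKETS[index]


# Anything bigger than 13 and up to 20 lands in the 20 bucket.
print(to_bucket(14))  # 20
print(to_bucket(20))  # 20
print(to_bucket(13))  # 13
```

The point of the sketch is only that a 20 means "somewhere in this range", which is a much easier statement for a team to commit to than an exact number.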

Also, I’m sorry for the delayed reply. We switched blog technology recently and I just discovered the new tool has a default limit of displaying 100 comments on any one post. So I couldn’t see your comment at first.

Ven said…

Mike,
Thanks for the knowledge harvesting on story point estimation; however, it seems the confusion still persists. I have yet to read your book. Currently we follow these steps in estimating story points.

First we size the stories by the complexity and risk they contain, on a Fibonacci series. So we have estimates such as S1, S3, and S5 for each story. Once that is done, the team assigns story points to the stories based on the ideal effort it thinks each will take, divided by 2 (we chose 2 and there was no rational logic behind it). So if we feel a story sized S1 takes 4 days, we assign it 2 story points, and if another story sized S1 takes 5 days, it gets 3 story points. But if the team looks at a story sized S1 and feels it will take 10 days, we look back at the size rating and see which has to change, the ideal duration or the complexity. Once agreed, we allocate the story points to that story.

Thus if we add up all the SPs and multiply by 2, we have an approximate duration for an ideal team. Let us say that all the SPs add up to 100; then it will take approximately 200 days for an ideal team to finish. If we have a better team than the ideal, there will be a better velocity and hence it may take less time.

During the next iteration planning, the team takes up the stories assigned to the iteration and creates the necessary tasks and hours for the tasks. The SPs give some rough guidance on the stories we can take up in the iteration. So if we have 200 working hours in the iteration, we pick stories worth 100 story points, and we have normally not been far off the mark. It also helps me in giving costs to the relevant stakeholders. So far our estimates and burndown are all within a very small variation, and even with changes to the team members we have not seen much variation. Any comments or tips? (I will read the book; I just ordered it today.)

Mike Cohn said…

Hi Ven—
There’s a lot in here that I don’t see why you do such as directly estimating complexity (who cares if something is complex except as that influences the effort involved, which is the point of this entire blog post), dividing by 2 and then multiplying by 2. I don’t have any specific tips other than the ones I’ve written about here and in the Agile Estimating and Planning book.

Mike Cohn said…

Thanks, Fabrice. I appreciate that.

Sven said…

Very interesting post, I have to admit that I did not read every single reply. I guess I got the big picture though.

The discussion about story points seems to be ongoing - with no solution. In my opinion storypoints should tell something about effort - but should never relate to time.

I like the metaphor of two points with a certain air-line distance between them. The effort would be that distance, whenever you want to travel from a to b you will have to travel at least that far.

Time comes in as traveling speed - velocity. The better your car, the faster you will arrive. The car is the team, the more experience they got, the faster the car can drive.

Your working environment is the roads you have installed. A straight highway will allow direct and fast movements, winding roads will slow you down and make you travel farther.

There is no risk in there as in brain surgery, but risk could be defined as the consequences of not reaching b - no matter if it is because your car has technical problems, you’re having an accident or you are heading the wrong way…

This does not answer how to define effort / distance in software development. But: it helps me to not confuse infrastructure or time needed for implementation with storypoints. I guess teams have to develop a certain feeling of what to consider in their own working environment.

Mike Cohn said…

Thanks, Sven. I’m glad you found this post interesting.

Yes, the debate is ongoing but I am confident there’s a solution. I can be very flexible on lots of things but I have no doubt, none, that story points are an estimate of effort and that effort is time. We can use them to estimate something different if we want but then they become predictors of that other thing; they can no longer be used to predict the duration of a project. And that is what our bosses, clients, and customers want to know, “When can I get this?” Bosses never want to know “how complex is this?”

I like your analogies with cars. I’ve used similar ones often. In my classes, I often ask people to estimate “how long to drive from here to some-other-city” and I try to pick somewhere that’s a long drive (say 20-30 hours) so it’s not easy. Most groups make assumptions and give me an estimate. Others don’t give me an estimate and rather than list assumptions they give me back the questions that underlie the assumptions (“what type of car? which route are we taking?”)

I think it would be a mistake though in estimating to define risk as “not finishing” or “not finishing in time.” That’s of course a risk, but it’s not the risk in the task. It’s more the consequence (a word you used). I fly on Saturday and will leave a certain amount of time in advance of my flight. That gives me maybe a 1% chance of missing my flight. The consequence of missing that flight is waiting two hours and getting a worse seat on a later flight. If I were flying from Denver to Europe, the consequence might be greater—waiting a day, getting to Europe late for a class, etc. The risk of missing the flight would be the same 1%; the consequence would differ.

And, yes, teams need to develop a feeling of what to consider in their own environments (as you say). That’s why I included risk (and uncertainty) so often in the comments as a *factor* in the total time-effort a story will take. Some teams take on more risk than others purely as a matter of their environment or project.

Thanks for your comments.

Sven said…

Thanks for your reply, Mike.

You made some points that I do agree to. We need to be able to talk about durations - otherwise management will not be happy with our agile approaches. I do agree that there is some correlation between storypoints and time, but as you said - storypoints are effort. Effort does not have to be time. It may be any other dimension you choose. The value should decrease with every task you finish though. And then: Whatever way you choose to complete all your tasks will influence the time needed for finishing your work.

I still think duration is a function of velocity and effort. Teams often start estimating storypoints as time needed for development. Unfortunately most of the time this means that they are thinking about all tasks they might have to do, adding up small amounts of time. The sprint commitment is done once all tasks add up to the sprint’s length. This has some major drawbacks:
a) Teams stop challenging themselves. They go into detailed planning and add some buffer just in case something goes wrong. The team will not reach its best performance.
b) Often the product owner just needs rough estimates to be able to adjust the priorities in the backlog. The team needs a lot of time for task-based estimates of stories. Those tasks might not even be implemented any time soon - maybe never. You just don’t want to invest in that kind of estimation.
c) If storypoints are the same thing as time, you would have to re-estimate your stories once your work environment has changed. This could be the case when switching technical infrastructure - but also if a team member is on a sabbatical - how can the product owner make his release plans without re-estimating all stories? (If storypoints are not time, he could do so by re-measuring the velocity of the new team - a single value - without touching any of the estimates already done by the team.) In each case the team has lost a lot of time during estimation on a task-based level.

Wouldn’t it be better to use storypoints / effort as the relative “size”  of a story and use the combination of velocity and storypoints to make predictions on how long the implementation will take? I think this way teams can estimate quicker, which will result in less overhead and higher productivity.
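To make that prediction concrete, here is a minimal Python sketch of forecasting from relative points and a measured velocity. The backlog size and both velocities are made-up numbers, and `sprints_needed` is a hypothetical helper, not something from the post.

```python
import math


def sprints_needed(total_points: float, velocity: float) -> int:
    """Forecast how many whole sprints a backlog needs at a given velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(total_points / velocity)


backlog_points = 120      # hypothetical: sum of the team's existing estimates
velocity_full_team = 30   # hypothetical: points per sprint, measured
velocity_one_away = 24    # hypothetical: re-measured with one member on sabbatical

# The estimates stay untouched; only the measured velocity changes.
print(sprints_needed(backlog_points, velocity_full_team))  # 4
print(sprints_needed(backlog_points, velocity_one_away))   # 5
```

Under these assumptions, a change in the team shows up as one re-measured number rather than as a re-estimation of every story.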

I am curious if you can convince me that I am wrong 😉

Greetings from Berlin

Mike Cohn said…

Hi Sven—
As I said, since story points are abstract you can define them as whatever you want but I see no point in defining them as anything other than something that addresses the big question estimating is an attempt to solve: telling a boss, client or customer when they’ll get something (or what they’ll get by a certain date).

To say that “effort does not have to be time” is to re-define well-established terms. As much as I’d prefer not to reference PMI on a largely agile blog, I suggest you see http://project-management-knowledge.com/definitions/e/effort/

Duration is absolutely a function of velocity and effort. That’s the point.

To say that story points are “size” is to obfuscate the issue. All that does is mean we have to define size. What can size possibly be but some measure of the amount of work there is to do. And that brings it right back to effort.

I thought about this while at the gym today. I picked up some dumbbells to see which were the heaviest I could hold 2 of in one hand. (Holding two dumbbells is awkward plus can be heavy but it gets more awkward than heavy.) I found that I can hold two 40-pound (18kg) dumbbells in one hand and could have walked them across the gym. Alternatively I can hold a single 120-pound (54kg) dumbbell in each hand and carry them across the gym. I would then say:

Carry 4x40 pound dumbbells = 1 story point
Carry 2x120 pound dumbbells = 1 story point

This would be the same effort (there’s that word) for me even though in one case I have 80 pounds in each hand but have 120 pounds in the other case.

If I had used an alternative definition of effort such as weight, how could I determine the duration required to move 4x40 + 2x120 across the gym? I couldn’t say I do 160 pounds “per iteration” (4x40) because sometimes I do 240. I couldn’t say I do 240 per iteration (velocity = 240) because sometimes I only do 160. I could do a bunch of iterations, some getting 160 and some 240 moved, and decide I have an average velocity of 200. But that’s wrong too because any backlog of a given size should take the same time to complete as any other product backlog of the same size; otherwise we have serious problems. So if I have a product backlog that weighs 480 pounds I should be able to divide that by velocity and get the duration. (You said so yourself above.)

But if that 480 is made up of all 40-pound dumbbells, it will take me 3 iterations.

If that 480 is made up of all 120-pound dumbbells, it will take me 2 iterations.

Story points have to be a measure of the thing our bosses, clients and customers care about: the time (effort) to do something. Otherwise, later dividing by velocity will get a measure of something other than duration.
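The dumbbell arithmetic can be checked with a short Python sketch. The per-trip loads (four 40-pound or two 120-pound dumbbells, one story point per trip) come from the example above; the two 480-pound backlogs and all names in the code are assumptions chosen for illustration.

```python
from fractions import Fraction

# From the example: one trip moves four 40 lb or two 120 lb dumbbells,
# and each trip is worth 1 story point.
POINTS_PER_DUMBBELL = {40: Fraction(1, 4), 120: Fraction(1, 2)}
VELOCITY_POINTS = 1  # one trip, i.e. one point, per iteration


def total_weight(backlog):
    """'Size' measured as weight in pounds."""
    return sum(backlog)


def total_points(backlog):
    """'Size' measured as effort in story points."""
    return sum(POINTS_PER_DUMBBELL[weight] for weight in backlog)


light = [40] * 12   # a 480 lb backlog made of 40 lb dumbbells
heavy = [120] * 4   # a 480 lb backlog made of 120 lb dumbbells

for name, backlog in (("light", light), ("heavy", heavy)):
    iterations = total_points(backlog) / VELOCITY_POINTS
    print(name, total_weight(backlog), "lb ->", iterations, "iterations")
```

Both backlogs weigh 480 pounds, yet the points-based forecast gives 3 iterations for the light one and 2 for the heavy one, matching the example. Dividing 480 by the averaged weight velocity of 200 would give 2.4 iterations for both, which is wrong for each.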

Sven said…

Hello Mike,

thank you for the response. I took quite a while to re-read the posts and think about everything mentioned. Most of our little discussion is about wording. After all it makes sense to say that effort is time - or let’s say directly related to time. Still I would try to avoid telling my teams that they are using storypoints to estimate time. I think it is crucial to avoid them focusing on the date something finishes instead of the relative amount of effort needed in comparison to other stories. The thought would be: If duration = time AND effort = time THEN effort = duration.

You also said that effort defined as time is well established - well, at least after translation into German this does not seem to be true. I’d rather define effort as the amount of work that has to be done to achieve something. Of course we usually will measure work in units of time - at least as long as deadlines or employees are involved. Normally employees will be paid based on the time they work. In some situations this might not be true. There are situations in which you will pay for work being done and you do not care a lot about when it will be finished. A physicist would rather use units such as joules or watts. I have to think about my battery recharger in the cellar. The effort to completely recharge my car’s battery is the energy that has to be used. It is also the electricity I will have to pay for. Another battery recharger might be faster, but it would not deliver a bigger amount of energy and it would not cost me more. The effort is still the same, not related to time at all. (I have to admit that waiting for a whole day sometimes is quite annoying.)

A last thing is your example with the nice surgeon’s tool, cutting the effort into pieces. As long as the tool can only be used for some stories this seems true - otherwise the velocity will go up and down depending on what stories you are working on. It might be different if the tool will change the time (duration, not effort) needed for every story in your backlog (e.g. improving the speed of your continuous integration process, if CI is related to your definition of done). In this case you might want to adjust the velocity instead of the effort. That way your team does not have to re-learn the meaning of a single storypoint.

Thank you for this blog, I really did learn a lot again.

Mike Cohn said…

Hi Sven—
I’m glad you’ve found all the discussion here helpful. So have I.

I want to clarify that I don’t think effort=duration=time. Let me just use hours for this one example to avoid confusion with points. You and I are going to paint my office. (Thanks for the help!) We look at it and think we can be done in 4 hours. That’s you and I both working for 4 hours. In that case we would say “effort=8 hours” but we would say “duration=4 hours.”

Duration is how long the project will take given whatever staffing assumptions, critical path issues, etc. apply.

This is nearly identical to your battery recharging example. But to discuss that I’d have to remember if it’s volts x amps = watts or if I have that backwards. I think about it this way: The effort to charge my fully discharged iPad is the same whether I use the 5 watt iPhone charger or the 10 watt iPad charger. The duration will be twice as long with the iPhone charger though as it only puts out half the watts.
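Both examples follow the same pattern, duration = effort / rate, which a few lines of Python can show. The 30 watt-hour battery capacity is an invented figure, the function name `duration` is hypothetical, and the sketch ignores charging losses and assumes painters work fully in parallel.

```python
def duration(effort: float, rate: float) -> float:
    """Elapsed time when `effort` is worked off at a constant `rate`."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return effort / rate


# Painting: 8 person-hours of effort, 2 people painting at once.
print(duration(8, 2))    # 4.0 hours

# Charging: the same (assumed) 30 Wh of energy through two chargers.
print(duration(30, 5))   # 6.0 hours with the 5 W charger
print(duration(30, 10))  # 3.0 hours with the 10 W charger
```

The effort is identical within each example; only the rate at which it is worked off, and therefore the duration, changes.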

Sven said…

Hi Mike,

You’re right about volts * amperes = watts (which you do not need to remember in that example) and certainly that duration and effort are not the same. I think this is accepted by most of the people out there. I think it is important to make it clear to everyone who has not thought it all through. (effort = time = duration is wrong, but it is not obvious as “time” may be used as a replacement for effort and for duration with different meanings. Two geeks might finish a painting job after four hours, still they invested 8 hours of work leading to 4 != 8 ->  effort != duration)

My conclusion is: Storypoints are directly related to time in that every storypoint - in a perfect world - will consume the same amount of time (investment, not duration) during development. To estimate stories with storypoints you will have to look at different factors such as complexity, doubt, “size” and whatever seems to be important in your working environment. You should be aware that it is only an estimate and try not to focus on details for every possible task. (You would not be very successful since using the Fibonacci sequence will automatically lead to clusters of stories with the same value but different efforts - it is not meant to be very precise.) It is more important to have stable values which reflect relative effort for all the stories in the backlog.

OK then, we probably can start painting your office. I hope you will have some nice chilled beverages around. Thanks for your refreshing commitment, I really appreciate that.

Mike Cohn said…

Hi Sven—
I think your conclusion paragraph says it quite well. Thanks for the great discussion. And I’ll have the beverages ready in my office. 😉

Tamás Bíró said…

Wow. I have read about two years of comments… lots of good advice and interesting discussion. 😊

One thing I would add. When you (most of you) refer to stakeholders “only counting time” and not caring about anything else (like hard thinking), I felt a little scorn in your words. It may have been unintentional, but I feel I need to “defend” stakeholders.

The primary reason for stakeholders counting time only is that in almost all cases, employees are paid for their time. In most companies, payroll is the most significant (if not only) cost factor in software projects. Monthly or weekly salaries are standard in companies around the world, so if someone works 2 months instead of 1, it will automatically cost twice, just on the payroll, not to mention time to market and other chain reaction like effects (such as brand deterioration due to late shipment, etc.).

Time is very important for businesses, so we (involved in agile development) have to understand and agree with business people on this.

So managers, marketing or sales people are not “some evil” we are forced to work with. These guys sell what we develop, and this is where budgets (for payroll, heating in our rooms, hardware, software, scrum training, even Mike’s books) come from.

We have to remember that our primary goal is to deliver business value, and software shipped too late (or simply not knowing when it will ship) is not of much value.

Mike Cohn said…

Hi Tamás—
I agree with your point that management is not “some evil we are forced to work with.” I’ve never liked the us/them mentality that has been part of some agile since the beginning.

If, however, you interpreted any of my comments about the business caring only about time (not complexity) to be scornful, that was a misinterpretation. I am merely pointing out that time is what the business cares about (and rightfully so), so that is what development teams need to estimate.

Tamás Bíró said…

Hi Mike!

Thanks for your reply. I think we are on the same side. 😊

Still, you wrote “they”, and “they don’t care”, let me quote: “Clients, bosses, customers, and stakeholders care about how long a project will take. They don’t care about how hard we have to think to deliver the project, except to the extent that the need to think hard implies schedule or cost risk.” This is what made me feel strange.

I think we have to stress two things.

One is that “they” count time because it IS important for the business, not because they are stupid/evil/ignorant/incompetent. I hear developers talking about stakeholders as “some evil” every day, and this is what I “felt” when I was reading this. I have read your books, and I know you DO NOT think like that. Still, someone not having read your books might interpret your words the other way, this is what I say. Good stakeholders (especially managers) must also care about other things, such as the job satisfaction of the team. And good stakeholders do. But the market does not. I think this is what people misunderstand, that the market is ruthless (eg. time to market, fixed budget contracts), and stakeholders (managers, marketing) have no choice but to reflect this.

The other thing is that scrum teams should care about time just as much as managers do. Time constraints are viewed by many as something coming from “outside” of the team, but in my opinion they should come from “inside” the team. You are totally right, there is no “we” work for “them”, but unfortunately many people think there is. And to some extent they are right, since “they” hire “us” for the job. I think teams should have a basic understanding of the market, too. POs usually have this understanding, but developers rarely do. When I was a freelance developer (15 years ago), I really enjoyed “maximizing work not done” myself, but most of my peers ridiculed me for “not writing perfect code”. I always delivered on time and was paid on time.

These things are important to me, because I think they affect estimation. And not in a positive way.

So estimation must reflect time. When teaching estimation, I would put more stress on why we need to do time estimation and why no one seems to care about complexity.

Mike Cohn said…

Hi Tamás—
I’ll stand by what I wrote in saying “they [the business] don’t care” about complexity. I don’t mean anything bad by that at all. Any inferences you bring to that are your own. I could similarly have written, “The programmers don’t care if the company is paid in Euro or dollars.” As a programmer I’ve never cared about that although I’ve cared deeply that the company did get paid and that it was doing well.

I think the business (and the developers) should be very concerned with how long things will take. That’s why I’ve largely made estimation the focus of the last 12 years of my life.

I can’t write every blog post with every caveat for every possible reader. Each would become tedious for those (like you) who do know I very much believe in whole team responsibility. People will read into these posts (or a book) what they want. I’ve been amazed at what people have quoted me as supposedly saying. I can often figure out someone in a course or conference heard what I said, interpreted through their own opinion and came up with something I never meant. Words are a dangerous thing.

Sachin said…

Hi Mike,

I personally like the doctor and the child analogy that you have used to explain, that story points are all about effort.

I have collected data from my Scrum teams and found that sometimes, a 3 point story takes less time than a 2 point story. And often teams and myself get confused by the notion that if it is all about effort, then invariably a 3 point story must take more time than a 2 point story.

Can you help explain this scenario?

Sachin

Dwayne said…

Mike, I’m pretty sure you never imagined this post would bring so many views and opinions!
Although I am a big fan of your work and writing, I have to disagree with the sole purpose (or what I perceive you to be saying is the purpose) of estimating.  I have been coaching agile practices for many years now, and have found that the greatest benefit of estimating is not giving a number to a C-level that says when we will be done, but rather the understanding that emerges in coming to that number.  Take your runner example: if the runners disagree on whether one trail is twice as long as another, that’s an opportunity to discuss what the trail really is.  Does one runner assume the terrain is much more difficult on one trail than another?  That brings in the complexity aspect, but it does in fact affect effort.  If the sizing discussion between the runners leads to an understanding that both trails are actually the same terrain, then a better understanding of what is needed (the feature) comes about.
Yes, agile estimating does need to also satisfy the C-levels.  But isn’t agile really about the fact that we don’t know (with any great certainty) what we are building, because we are learning as we build, and so those ‘estimates’ are not all that much more reliable than the waterfall equivalent?

Mike Cohn said…

Hi Dwayne—
I’m glad you generally like what I write but am sorry we disagree here. I’m not sure how strongly we disagree though since, as you point out, you are inferring my opinion rather than quoting me on it.

Nowhere have I said that the sole purpose (or even primary purpose) of estimating is to give a number to a C-level executive as you think I might be saying. Instead, I have said that the purpose of estimating is to make better decisions—those decisions could be ones made by C-level executives but they could just as easily be ones made by the product owner or the developers on the team.

As for complexity, please note that I did not say complexity is irrelevant. I said that we are not estimating complexity directly. We estimate how long something will take and complexity is a factor in that. Helping my daughter with her calculus homework tonight was more complex for me than flipping heads 100 times in a row on a coin. Flipping heads isn’t complex at all. However, doing it 100 times in a row would take me a long time. Complexity is a factor, not what we estimate directly.

In your first paragraph you shift from “sole purpose” of estimating to “greatest benefit.” I can agree that the greatest benefit that comes out of sprint planning (and the estimating in that) is the greater understanding. In my CSM classes I make the point that sprint planning is a bit misnamed as the meeting is really planning + product design + technical design and that the latter two are the more important activities.

As for “sole purpose”, though, please show me a team that has ever said, “We need greater understanding of what we’re building. Let’s estimate!” That doesn’t happen. So, greater understanding may be a great outcome of estimating but it has never (that I can recall) been the sole or primary purpose of estimating. We estimate to make better decisions. Yes, sometimes those decisions are made by C-level executives.

There will always be many problems with estimating but there are things we can do to be better off than in a waterfall environment. Things like use ranges, separate the estimate of the size from the calculation of duration, estimate at multiple levels (release and sprint planning), estimate features, measure progress only on finished features (velocity), etc. All of these are described in Agile Estimating and Planning, as well as in many other places by now. So, while you may see both an agile team and a waterfall team sitting in a row estimating, much of what they produce in those estimates should differ even though a casual glance may make it appear they are exactly the same.

Thanks for sharing your comments, Dwayne.

David Patterson said…

This is an interesting discussion, but I, too, have a problem.

There is a serious disconnect in estimating stuff that was brought to my attention by Dr. George Hazelrigg of the National Science Foundation. He pointed out that part of the reason we (engineers, programmers, and to some extent painters) all make bad estimations is that we are asked the wrong question by project managers, scrum masters, et al. The question we are usually asked is “What is your best estimate of {variable of pain} it will take to do a task?” By “variable of pain” I mean time or money or effort or complexity or power consumption or paint consumption, etc.

The issue that arises from that question is that the answer is a sample from a Probability Density Function (PDF), and it is a PDF that is usually from a “One-tailed Distribution.” If we estimate 8 hours of effort, the task cannot take -2 hours, but it is not inconceivable that it will take 25. By asking “what is the most likely…” you are subconsciously asking “what is the mode of the distribution?” But, to use the result in any statistically valid way, we need an estimate of the PDF, not the mode. It turns out that for one-tailed distributions the mode is always closer to zero than the mean (which is the valid estimator of the PDF). Dr. Hazelrigg felt that this difference is the structural source of underestimation.

The better question to ask (and one that gets to the question of the mean of the distribution) is “If you had 100 people of various skill levels how long would it take them on average to complete this?” This question forces you to think about the pessimal cases and use them in the estimation process.
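The gap between mode and mean can be illustrated with a small sketch. It assumes task duration follows a log-normal distribution, which is only one plausible right-skewed shape and is not something the comment above specifies; the 8-hour median and the spread value are made-up numbers.

```python
import math

def lognormal_summary(mu: float, sigma: float) -> dict:
    """Closed-form mode, median and mean of a log-normal distribution."""
    return {
        "mode": math.exp(mu - sigma ** 2),
        "median": math.exp(mu),
        "mean": math.exp(mu + sigma ** 2 / 2),
    }

# Hypothetical task: median of 8 hours with a moderately long right tail.
summary = lognormal_summary(mu=math.log(8), sigma=0.6)
for name, hours in summary.items():
    print(f"{name}: {hours:.1f} hours")
```

Under this assumption the "most likely" answer (the mode) sits below the median, which in turn sits below the mean, so answering with the mode tends to understate the average outcome.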

Sven Peetz said…

@David
These are interesting thoughts which make sense in a way - but my first guess is that when using story points they are not important. This is because the question asked is not “How many days will it take?” but “how long will it take in comparison to the other tasks we have?”. This eliminates the variable of pain - as even if we got day-based answers like “If that story takes us 2 days, the other one takes 4 days” we would say “the second story takes twice as long as the first.” Now - assuming that this ratio does not depend on the PDF (which it might) it doesn’t matter if our team estimates the median or any other quantile within the PDF - as long as they are consistent… e.g. if 2 days and four days are the correct median estimates and we really estimate the 20% quantile of the PDF, this would turn out to be something like 1.5 days and three days. The ratio is still the same. Since we use experience-based velocity to judge how many story points can probably be done in one iteration, the only thing changing would be the velocity. This would be absolutely fine as it doesn’t really matter if this is a high or low value.

This is just my initial thought on that. It might be wrong.
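A rough way to check this reasoning is sketched below. It assumes both stories have log-normal durations, an assumption introduced here purely for illustration, with medians of 2 and 4 days; the spread values are invented. When both stories share the same spread, the ratio between them is the same at every quantile. When the spreads differ, the ratio shifts, which is the "it might be" caveat above.

```python
import math
from statistics import NormalDist

def lognormal_quantile(median_days: float, sigma: float, q: float) -> float:
    """q-th quantile of a log-normal duration with the given median and spread."""
    z = NormalDist().inv_cdf(q)
    return median_days * math.exp(sigma * z)

q = 0.20  # the team consistently estimates the 20% quantile instead of the median

# Same spread for both stories: the ratio is preserved.
short_story = lognormal_quantile(2, sigma=0.35, q=q)
long_story = lognormal_quantile(4, sigma=0.35, q=q)
ratio_same_spread = long_story / short_story

# Different spreads: the ratio depends on which quantile is estimated.
risky_long_story = lognormal_quantile(4, sigma=0.9, q=q)
ratio_diff_spread = risky_long_story / short_story

print(round(short_story, 2), round(long_story, 2), round(ratio_same_spread, 2))
print(round(ratio_diff_spread, 2))
```

With the same spread the two estimates come out near 1.5 and 3 days and the ratio stays at 2, so only velocity changes. With different spreads the ratio is no longer 2.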

Mike Cohn said…

Hi David—
You are absolutely right that how we ask the question will influence the answer. I do often ask the “100 people” question you pose. Normally I’ll ask it in the form of “if we did this 100 times…” I’m not sure why I’d ask it of people of various skill levels when I have a team of a known skill level. Also, note that in this post I am talking about estimating story points (which will almost certainly involve work from multiple people, unlike a task that will be only one person’s work).

You’re also right about an 8 hour estimate perhaps taking 2 but perhaps taking 25. I wrote about this in the first chapter of the Agile Estimating and Planning book as one of the key problems with estimating.

Has Dr. Hazelrigg published anything along the lines of his thoughts on this? If so, I’d appreciate it if you can post some references or email them to me. Thanks.

Vinod said…

Hi, I had one question: what if we ask the brain surgeon to lick the 1000 stamps? Would he still rate it the same as the brain surgery?

mikewcohn said…

Yes, if the brain surgeon could lick 1000 stamps or do the surgery in the same amount of time, he’d put the same number of story points on them.

Ashish said…

I tried to post earlier too, but I’m not sure it went through and I can’t see my comment here, so I’m resubmitting. Consider a scenario: for a project, two systems are impacted. For Story S1 from system A, the team submitted 3 points, and for Story S2 from system B, the team submitted 5 points. Now team A allocated 5 people and asked for 120 hrs; team B allocated 1 person and asked for 40 hrs. Here, story points totally skewed the matrix. What do you suggest should be done here?

mikewcohn said…

Hi Ashish—
If I understand the question, you’re saying that Team A and Team B have different meanings for what a story point is. You can either live with the fact that they have different meanings (quite feasible in many cases) or create a common baseline which is described in this post: http://www.mountaingoatsoftwar…

Ashish said…

First of all, thanks for taking the time to reply. What I was trying to say is that there are two systems impacted in my project, so the team comprises representatives from systems A and B. However, after story point estimation, the system A team member allocated 4 more people along with him for the story estimated at 3 points, but the system B team member worked alone on the 2nd story, which was estimated at 5 points. In the end, the effort for story 1 was more than for story 2. So here I believe the issue is that the people working should be fixed, and story point estimation should consider the effort required by all the people working on it?

mikewcohn said…

Hi Ashish—

I’m trying to follow this but still struggling. I think it’s because we’re coming from different perspectives of what points are for and when they’re applied.
Story points are about effort. Assume 1 person for a moment—just me. If Thing X will take me 100 hours, I can call that let’s say 10 points. If Thing Y will take me 50 hours I should call that 5 points. That’s ideal. I’m not an ideal estimator so maybe I call them 11 and 4 or 10 and 6 or even 13 and 3. So there’s imprecision. That’s why I advocate techniques like Planning Poker that use a limited set of numbers—e.g., everything I’d want to call 6, 7 or 8 gets lumped into an estimate of 8.
Adding more people to the question doesn’t change things—since points are about effort if you and I can do Thing X in 100 hours still (same as above) then it still gets 10 points. So the number of people is irrelevant and that might be what’s messing me up with your question. You seem to be saying that the team gave Story A more points than Story B then did the work and found the opposite to be true: Story B turned out to take longer than Story A. If so, I’d agree that’s bad. But I’d also agree it can happen—it just shouldn’t happen a lot (depending on how much bigger A was thought to be than B).
These two posts may address the issue better for you: http://www.mountaingoatsoftwar… http://www.mountaingoatsoftwar…
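The two ideas in this reply, points proportional to total effort and lumping nearby values onto a limited set of numbers, can be sketched as follows. The deck values and the round-up rule are assumptions for illustration; the reply only says that a limited set of numbers is used and that 6, 7 and 8 all become 8. The 10-hours-per-point scale comes from the 100-hour example.

```python
from bisect import bisect_left

# Assumed card deck; not specified in the reply above.
DECK = [1, 2, 3, 5, 8, 13, 20, 40, 100]
HOURS_PER_POINT = 10  # from the example: 100 hours is called 10 points

def raw_points(total_effort_hours: float) -> float:
    """Points in proportion to total effort, however many people share the work."""
    return total_effort_hours / HOURS_PER_POINT

def to_card(points: float) -> int:
    """Round a raw estimate up to the nearest card in the deck."""
    index = bisect_left(DECK, points)
    return DECK[min(index, len(DECK) - 1)]

print(raw_points(100), raw_points(50))   # 10.0 5.0
print([to_card(p) for p in (6, 7, 8)])   # [8, 8, 8]
```

Note that `raw_points` takes total effort only. Whether one person or two people spend those 100 hours does not change the result, which is the point made about headcount.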

Ashish said…

I think I am getting it now. When my team did estimation, we got direction from someone that story points relate to complexity and not effort, which didn’t make sense to me, but as I was new to agile, we moved ahead with that approach. Also we assumed 3 points means work will be done in 2-3 days, 5 points means 4-6 days. Now based on complexity, for story A the team gave 3 points and for story B the team gave 5 points. However, later, the system A representative who gave the 3 point estimate for story A (remember story A impacts system A and story B impacts system B) said he would allocate 5 resources for 3 days, so he needs 5*8*3 = 120 hrs. The system B representative who gave the 5 point estimate for story B said he would allocate 1 resource for 5 days, so he needs 1*8*5 = 40 hrs.
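The arithmetic in this scenario can be laid out as a short sketch. The 8-hour day is taken from the figures above; the dictionary layout and names are only for illustration.

```python
HOURS_PER_DAY = 8

def person_hours(people: int, days: int) -> int:
    """Total effort in hours when `people` each work `days` full days."""
    return people * HOURS_PER_DAY * days

stories = {
    "Story A": {"points": 3, "people": 5, "days": 3},
    "Story B": {"points": 5, "people": 1, "days": 5},
}

effort = {name: person_hours(s["people"], s["days"]) for name, s in stories.items()}
print(effort)  # {'Story A': 120, 'Story B': 40}

# The story with fewer points ends up with three times the effort.
print(effort["Story A"] / effort["Story B"])  # 3.0
```

Ranking the stories by points and ranking them by effort give opposite orders here, which is the mismatch being described.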

So this produced opposite results. I believe that had we estimated by associating points with some range of hrs, we would not have been in this situation, because in that case hrs and days would be fixed and it would only depend on how many people they want to put on the work to get it done. Also, I think your post which shows the median hrs associated with points will also help…

Please let me know if I am now on the right path… And thanks a lot :)

mikewcohn said…

Hi Ashish—

Yes, you’ve got it now.

A lot of people make this mistake of thinking story points should represent complexity. And it works a lot of the time because complexity is a huge driver of how long something will take. But complexity and effort are not perfectly correlated. Sometimes something is simple but there’s a lot of it so it takes a long time. You ran into a situation with the example here where complexity and effort weren’t perfectly correlated.
Story points have to be about effort because that is what our bosses, clients and customers care about. They want to know when we’ll be done. Not how hard our brains will need to think (complexity). So, we estimate effort.

Chand Warrier said…

If point-based estimation is about “time the work will take”, why is it called size estimation and not schedule estimation? If it’s effort and not complexity, why don’t we express it directly in hrs instead of a unitless measure? I look at Story Point (SP) estimation like a Function Point (FP) estimate, with the only difference being that FP is scientific in nature and SP is an analogy based on complexity, volume of work (number of screens, number of table changes etc) and the assumptions that you make.

mikewcohn said…

Hi Chand—

Rather than “size” or “schedule”, I would suggest it be called “effort,” which is often what it is called. Schedule estimation to me would imply the sequencing and such of work, which is very different than estimating the size or effort. (Scheduling, for example, could consider things like critical path.)
There’s no “if it’s effort and not complexity.” It is effort and not complexity. The reasons why to not use hours are probably explained in the 100+ comments on the post but to state the main one again very clearly: If you and I produce at different rates, we cannot agree on a time-based estimate. The example I give repeatedly is of walking to a building. You can walk to that building in 5 minutes. I’m on crutches and so it will take me 10. We cannot in any way agree that it is a 5-minute job or a 10-minute job. For you, it’s 5. For me, it’s 10. We can, however, agree that walking to that building will take half as long as walking to some *other* building. We could then call the first building a “1-point building” and the second a “2-point building.” [Side comment: think now about complexity. Walking to each building would be equally “complex”.][Additional side comment: hobbling there on crutches could be considered “more complex” for me than for you, in which case “complexity points” don’t solve the fundamental issue of your-time not being additive to my-time.]
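The walking example can be put into numbers with a small sketch. The distances and walking speeds are invented so that the first building takes 5 minutes for one walker and 10 for the other, as in the example; only those two durations come from the text.

```python
def walk_minutes(distance_m: float, speed_m_per_min: float) -> float:
    """Time to walk a distance at a constant speed."""
    return distance_m / speed_m_per_min

# Hypothetical distances and speeds chosen to reproduce the 5 and 10 minute walks.
buildings = {"first building": 400, "second building": 800}
walkers = {"you": 80, "me on crutches": 40}

ratios = {}
for walker, speed in walkers.items():
    times = {name: walk_minutes(d, speed) for name, d in buildings.items()}
    ratios[walker] = times["second building"] / times["first building"]
    print(walker, times, "ratio:", ratios[walker])
```

The two walkers never agree on minutes (5 and 10 versus 10 and 20), but both see the second building as twice the first, which is what lets them agree on 1 point and 2 points.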
I can buy your comment comparing function points and story points. [I’d add risk and uncertainty to what comprises story points.] I’d question, though, whether function points are truly scientific in nature. They are certainly much more objective but not 100% so. I have nothing against function points and have, in the past, hired function point experts to come into agile projects and measure the after-the-fact number of function points a project delivered. The biggest issue with function points is that they are not very easily derived in advance, which is why they are so little used in practice.

Chee-Hong Hsia said…

Hi Mike,

I did some research on relative estimation, which had quite an interesting result.

The conclusion was that it actually doesn’t matter how wrong the team’s initial estimates of the items were; in the end the team’s velocity will always auto-correct itself. Because of this result, I often tell my team: “Don’t stress, just come up with a complexity number. The chances of you being wrong are higher than the chances of being right.”

http://relativecomplexitytheor…

mikewcohn said…

Hello. I completely agree about velocity auto-correcting. I call velocity “the great equalizer.” This works as long as the estimates are consistent. If I call something “5” and call all other things that will take about the same amount of time “5” then velocity equalizes out any error even if in hindsight the things all should have been called “10”.
I don’t, however, agree with estimating complexity, which was the whole point of this post and all the followup discussion in the comments. Complexity is only a factor and is estimated only to the extent that it influences the effort (total time on task).
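The "great equalizer" effect can be shown with a minimal sketch. It assumes a simple forecast of remaining points divided by last sprint's velocity, and the story counts are invented; the 5-versus-10 labels come from the reply above.

```python
def sprints_to_finish(remaining_estimates, completed_last_sprint):
    """Forecast sprints remaining as total remaining points over observed velocity."""
    velocity = sum(completed_last_sprint)
    return sum(remaining_estimates) / velocity

# Hypothetical backlog: six similar stories remain, two were finished last sprint.
called_five = sprints_to_finish([5] * 6, completed_last_sprint=[5, 5])
called_ten = sprints_to_finish([10] * 6, completed_last_sprint=[10, 10])

print(called_five, called_ten)  # 3.0 3.0
```

Doubling every estimate doubles the measured velocity too, so the forecast is unchanged. This only holds when the estimates are consistent with each other, which is the condition stated in the reply.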

Guest said…

Mike,

I’m a little confused. First of all, I totally agree that story points represent effort and not complexity.

However, why don’t we assign points to bugs (specifically, bugs that are found after sprint completion and bugs that are inherited from legacy code)? That’s effort, too, right?

mikewcohn said…

Hi Joseph—

It really depends on what the bug is and when it was found. Points on the product backlog items let us know how much work remains—so that argues for putting points on the bugs. But, points count toward velocity and giving a team points for fixing a bug they made in the last sprint overstates velocity so that argues for not putting points on those items.
So in one of the cases you ask about—bugs in inherited legacy code—yes, put points on those. But a bug found after sprint completion (meaning we find it this sprint but created it last sprint), don’t put points on. Here’s more on why: The team finishes a 5-point story in sprint 1 but it is buggy, buggy, buggy. In sprint 2 they discover bugs—so many bugs they decide fixing them all is worth four points. They fix them. Over the course of those two sprints the team got 9 points for what was really 5 points worth of work—it was only 9 points because they screwed up so much.

Mike Li said…

Hi Mike, thanks for sharing. I have one question. If story points are about the time the work will take, then what is the benefit of using them instead of some other time-based estimate? I like relative estimation because I think it should not be affected by who will actually do the work. That will affect the velocity.

So, the size of licking 1000 stamps and the simple brain surgery will be similar. But if the kid has to take on the brain surgery, the velocity will be very, very low because he has zero skill at it.

mikewcohn said…

Thanks, Mike.

As for why to use story points, see the reply to Chand on 6 October, 2013. Also the chapter “Story Points and Ideal Days” in the online video course on Agile Estimating and Planning at http://www.mountaingoatsoftwar… has quite a discussion on the merits of story points. It’s something I should do a blog post on someday but for now the information is all in that video (plus scattered throughout comments on this post).
As for the kid doing the brain surgery, two points: Note that in the post I said we assume the right person for the job does the job. That applies here. This is a valid assumption in real life. Second, as pointed out somewhere in earlier comments, while I think the example is helpful it suffers from really being treated more as a pair of one-person tasks than I’d want for real user stories. That is, on a real project, a user story will typically involve 3-4 people not one as a task would.

johnr said…

“it was only 9 points because they screwed up so much.”

But it *was* 9 points. You just said so. Why does it matter why it was 9 points or when the work was done to complete the 9 points? Seems arbitrary.

mikewcohn said…

John—

Imagine a team decides to abuse this. They “finish” a 5 point story and claim it’s 5 points. But they left in a lot of bugs. They fix those next sprint and claim 4 points of credit for “fixing” them. But they didn’t fix them. They screwed up more code. So next sprint they fix that and give themselves 4 more points. Then 2 the next sprint for fixing those bugs and so on. The team could claim 200 points of credit (or more!).
What you’re getting at is really the definition of velocity—is it a measure of how much forward progress a team made (5 in this example) or a measure of how much work they did (9 in the reply above you’re referencing). My next newsletter is on exactly that topic. I think that goes out today, in fact. And will be posted as a new post here in a week, I think. See that for more on the two possible definitions of velocity.

Simon L said…

Thanks for the good post, Mike. My tendency is to encourage teams to point everything - defects, Stories, tech tasks - so we have transparency on where our effort is going and also so we build an accurate and representative Velocity. In reply to your post above, I understand you’re saying that we’ve claimed 9 points for a 5 point Story and 4 points of re-work. However, is it also possible that the team under-estimated and rushed the work, resulting in defects? By capturing the defect points and discussing them in our Sprint review in relation to the original 5 point Story, we can provide a more accurate representation of Stories of this type for later relative sizing. I.e. maybe this Story was actually closer to an 8 than a 5.

mikewcohn said…

Hi Simon—
A little discussed thing about velocity is that it really can be used to measure either of two things: How fast a team is moving forward and how much they completed. Those can be subtly different. For example, using your example, I’d say the team made 5 points of forward progress (the 4 points of bugs aren’t really forward progress). I could also say the team did 9 points of work. Both can be called velocity and this dual meaning creates lots of confusion.
I actually covered this in my February newsletter. Normally newsletters get posted here sometime after they go out so look for that in an upcoming post.
So, I completely agree with what you’re saying. But some teams would call that velocity 9 and others 5. Both can be right for different things.
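The two readings of velocity can be made concrete with a small sketch. The `DoneItem` type and the `is_rework` flag are hypothetical names introduced only for illustration; the 5-point story and 4 points of bug fixing come from the example being discussed.

```python
from dataclasses import dataclass

@dataclass
class DoneItem:
    name: str
    points: int
    is_rework: bool  # True when the item fixes a bug the team introduced earlier

def work_done(items) -> int:
    """Velocity as total work completed, rework included."""
    return sum(item.points for item in items)

def forward_progress(items) -> int:
    """Velocity as progress through the backlog, rework excluded."""
    return sum(item.points for item in items if not item.is_rework)

two_sprints = [
    DoneItem("story from sprint 1", 5, is_rework=False),
    DoneItem("bug fixes in sprint 2", 4, is_rework=True),
]

print(work_done(two_sprints), forward_progress(two_sprints))  # 9 5
```

Tracking both numbers keeps the transparency about where effort went while still forecasting the release from forward progress alone.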

Felipe Brito said…

Hi Mike,
Interesting analogy! I have one comment/question: we need x story points for licking 1,000 stamps and x story points for performing the brain surgery at a given sprint….Assuming that the kid and the doctor are working for continuous improvement…I believe it would be hard to perform the brain surgery faster without compromising quality. But wouldn’t it be interesting to measure the x story/complexity points of the kid’s task so in the retrospective meeting they can discuss this and create a mechanism that will “lick” 9 or 16 stamps at a time (no compromise in quality!) so the kid/mechanism can lick many more stamps per sprint (with the same effort!) or have more time to study to become the future brain surgeon?
My point is that it is not about effort. It is about the “size”/“square footage” of what we are delivering. I agree with you that Complexity Points is not a perfect name, but it is the one that fits best in my opinion, because it helps in comparing different stories. And usually one story that is more complex is larger….
Best regards!

mikewcohn said…

Hi Felipe—

Thanks for sharing this but there are a couple of problems with the argument:
1) I’m pretty sure that brain surgery has improved more over the last, say, twenty years than stamp licking. However, that’s not your main point. Teams should pursue improvements (new mechanisms in your example) in all areas of their work.
2) You make a claim that “usually one story that is more complex is larger”. First, we need to agree on what larger means: I’m saying it is the effort to complete and I think you’re saying the same. That is, complex stories usually take longer to complete. I completely agree and that supports the fact that story points need to be about effort. I said in the post and in many comments that complexity is a factor (i.e., it can make something take longer) but it is *not* what we estimate directly. Your estimate needs to be of how long something will take—and how long it will take is impacted by how much of it there is, how risky it is, how much uncertainty there is, and, yes, how complex it is. But those other factors are only relevant to the extent they impact duration.
I know nothing about surgery other than what I’ve learned from TV, but consider these two examples then: brain surgery and an appendectomy. Let’s assume those can be done in the same amount of time. If you disagree, just pick two different types of surgery. It won’t matter. If they take the same amount of time, they get the same number of points. Yet, I’m just guessing that brain surgery is “more complex” than an appendectomy in some way that a surgeon might assess that—perhaps whichever one can have more things go wrong, whichever has the most possible complications, whichever fails the most, whichever the surgeon gets more nervous before, whichever requires more training, etc.
So, per my claim brain surgery and an appendectomy get the same number of points. By your logic they’d get a different number of points. Brain surgery may be more complex but that is meaningless to the hospital, though, when they schedule the operating room. Remember: They can only schedule 6 surgeries (of either type) per day. The hospital’s “velocity” doesn’t allow them to do more appendectomies just because they are simpler (if they take the same amount of time).
Finally: even if you want to estimate complexity directly, why would you? Consider every boss, client or customer you’ve ever had. Have any ever come to you and said, “I don’t care how long this is going to take. But I’m worried about your brain. How complex is this?” I suspect not. Yet, nearly every boss, client or customer wants to know how long something is going to take.

Felipe Brito said…

Hi Mike,

Thank you for your reply!

My claim saying that “usually one story that is more complex is larger” is just to suggest that we should not bother with the term “Complexity Points”. I will tackle this later on…But this is not the main point.

My main point is that it is not about *effort* (hours to do something); it is about *sizing* (square footage of what is delivered - and please don’t misinterpret this as number of lines of code…). It’s all about delivering more value to our customers. This goes back to licking 9 or 16 more stamps in the same amount of time (same effort, much more value delivered).

I believe you chose the brain surgery and stamp licking example at first because they were completely different in terms of complexity, but took the same time. The fact that you are now using different types of surgeries, assuming that they take the same amount of time to be performed and trying to tackle velocity meanwhile makes it harder to grasp in my opinion.

If I may, let me change the example. Please forget about complexity for a moment. We will bring it back later… Let’s focus on size (not effort!).

One client contracts a person to build 2 identical 100 sqft rooms. The person builds the first room with no tools. The room is ready in 2 days. The same person builds the other room (that is identical to the first one) with tools. Room is ready in 1 day.

Same outcome for the client! The complexity is the same, the size is the same and I believe both rooms have the same number of complexity points. The efforts are different!

You don’t believe that the second room has half of the “complexity points”, right? I hope not.

Now let’s make it more complicated a bit…

That same person from the previous example will build 2 other 100 sqft identical rooms. He/She uses the same tools to build both rooms.

The first room is built in bad weather. Risk is increased. Room is ready in 2 days. Second room is built on a sunny day. Room is ready in 1 day.

Same outcome for the client (he/she does not care whether it was raining or not… He/she wants the rooms 😊)

Size again is the same. Effort again is different. Unfortunately bad things happened; a risk materialized. This should have in my opinion the SAME number of complexity points.

Now let’s bring complexity back (in a way that is more common to software development) - and that is the main reason of my claim that “usually one story that is more complex is larger”...

That same person from the previous examples will build 2 other 100 sqft rooms. He/She uses the same tools for both rooms and we are enjoying sunny weather for the week…:)

According to customer needs, the first room has to use a different foundation and has pipes connecting to other rooms. Complexity is increased. Room is ready in 2 days. Second room is built with regular foundation and no pipes connecting to other rooms. Room is ready in 1 day.

We have different complexities and different sizes. First room is “larger”, despite both having 100 sqft. It delivers more value to the customer.

To your last paragraph… if you have the right framework to estimate and consider complexity, you track this back to value delivered. It becomes more transparent to your customer. Moreover, it helps you push for continuous improvement. You may start working with pre-built rooms, for example… When you say “Story points are not about the complexity of developing a feature; they are about the effort required to develop a feature.”, for me it makes it harder for us to focus on pushing towards continuous improvement.

In my opinion the most important thing for the client is not “how long something is going to take”, but what value he/she is taking from it.

Does it make sense?

Best regards,

Felipe Brito

mikewcohn said…

Felipe—

If two people do the same thing (build a room) and are in different companies they would likely use different units. If one says “I’ll build it by hand” and the other says “with tools” then the one with tools should come up with a lower number of points.
I have no idea what size would mean in software other than effort.

And of course it is all about delivering value. Please see other blog posts on here about how one cannot optimize the delivery of value without knowing the cost of things. I know some agilists say that is the case but they should take a basic college course in managerial economics and truly understand ROI. Optimizing value involves maximizing value, minimizing cost, and selecting among alternatives across an efficient frontier. Saying that it’s all about value is fine but value still considers cost. That’s what this discussion is about.
I still challenge you to find a customer who cares about complexity but does not care about effort. You won’t.
Please go ahead and continue estimating complexity if you’d like. You don’t need to convince me it’s ok. You’ll eventually need to convince a boss, client or customer that how hard your brain needs to work is more relevant to them than how long something will take. Good luck with that.

Felipe Brito said…

Hi Mike,

“If two people do the same thing (build a room) and are in different companies they would likely use different units.”

I respectfully disagree. Everybody uses sqft here in the US and in a couple of other places. The rest of the world uses square meters and one can easily convert between them. I am not saying that we are there in the software industry; neither am I saying that it is easy. But I believe that we should strive for this, starting with the same teams dealing with different projects, then different teams dealing with similar projects, then different teams dealing with different projects. We are doing this in my company and we will be writing about this very soon.

“If one says “I’ll build it by hand” and the other says “with tools” then the one with tools should come up with a lower number of points.”

How do you measure productivity increase with this approach? How do you push towards continuous improvement if you hide what you are improving?

On the other hand, if things get worse and you double the number of points to deliver the same room, how do you know that there is a problem and you act to correct it?

“I still challenge you to find a customer who cares about complexity but does not care about effort.”

I never defended this idea, but you insist on saying that I did. My customers don’t care if I spend 10 or 20 hours on something. Of course we are transparent and share this with them - this is agile at the end of the day. And there is some kind of correlation between the effort and the value delivered (as you wrote). What they really do is push us to do things better every time and to have more software delivered year after year (more tickets per person, more complexity points in a sprint…)

“You’ll eventually need to convince a boss, client or customer that how hard your brain needs to work is more relevant to them than how long something will take.”

No, I won’t. What is really important to them is how healthy the patient is after the surgery and how good the room turned out. It is not how many hours I spent in surgery or building the room.

Have a great weekend!

Patrick Lamasney said…

Hi Mr. Cohn,

You seem to have changed your mind on story points.

In 2006, quoting you from the book Agile Estimating and Planning: “A story-point estimate is an amalgamation of the amount of effort involved in developing the feature, the complexity of developing it, the risk inherent in it.”

Four years later, quoting you from the blog post above: “So, story points are about the effort involved. Feel free to adjust your estimate of effort based on things like risk and uncertainty, but point-based estimating is about the time the work will take.” 

After almost 8 years of being a Scrum Master, I agree with your 2010 self that time is what our clients, bosses, customers and stakeholders care about.  However, from a strict Agile perspective I agree with your 2006 self more.  I have worked hard to get my teams to use your 2006 definition of story points.

Can you help me understand:  Have you changed your mind or has your answer simply evolved and I’m just not getting it?

Thanks for your time.

mikewcohn said…

Hi Patrick—

Honestly, I don’t see any significant difference between those two statements. In the first I say:
story points are a function of effort, complexity and risk

in the second I say

story points are a function of effort, risk and uncertainty.

Risk and uncertainty are clearly related concepts. So we can ignore any minor difference there.
The only difference is the use of “complexity” in the first (the book) and not in this post. If you read all the comments on the post, though, you’ll see I acknowledge that complexity is still a factor (to the extent it influences the effort involved, which I believe is consistent with the book).
If I ever did a second edition of Agile Estimating and Planning, the big change I’d like to make is to de-emphasize the word “complexity.” Way too many teams grasped on to that and insist on estimating complexity directly—as though our customers and clients care about complexity (except when it influences the effort (time) something will take.)
So: Complexity is a factor to the extent it influences effort. Complexity on its own is irrelevant. So I’d say my way of explaining this has evolved perhaps because that has become more absolutely clear in my mind. It’s become even more clear since writing this post.

johnr said…

Thanks for the reply, Mike. I find a bias towards teams as having “screwed up” not helpful in most cases, as I don’t think of it this way. The team is doing its best. Yes, it is making mistakes. Software development is hard. Bugs are a normal part of development. I find it arbitrary that a team can get credit (points) for finding and fixing bugs within the life of a story, and that is considered perfectly normal and acceptable; but if bugs pop up after a story is “done”, then for some reason they messed up and have to pay for it. This is especially troubling to me if the team is evaluated (or even paid) based on its velocity. There are more factors that result in escaped bugs than just developers messing up. Pressure from client or product owner, misunderstood acceptance criteria, etc.

mikewcohn said…

John—
I was describing a situation in which the team (in my mind) pretty much messed up on purpose as a way to increase their velocity. Just leave a lot of bugs around, claim credit and velocity goes up. If fixing those bugs counts as more points later, even better for the velocity reports on such a team.
I completely agree, of course, that bugs are a normal part of development—they sure are in anything I’ve ever coded! :(

Jeff said…

Hi Mr. Cohn,
I was linked to this blog after we were trying to point a story. The person who would QA the story was somewhat new to the product involved, so it was argued that the estimation should be increased, due to the increased risk/uncertainty. The insinuation was that the QA person involved would need to spend a lot of time conferring with other QA staff who were more familiar with the product in order to set up test cases or what have you.
My argument was that the actual effort was the same, but that this person, being new, would obviously take toward the higher end of the scale regarding time. I think you made an analogy to this effect: The effort to walk to a building is the same for the able bodied person vs a person on crutches, because we use the effort to compare stories to each other, not as a complete substitute for the time it takes to complete them.
What are your thoughts on this? Is it appropriate to take into account training/newness in the estimation? Or if the uncertainty of the person’s skills is such that it would inflate the estimation, are they actually ‘The wrong person for the job’?

mikewcohn said…

Hi Jeff—
Most important point: Just call me Mike. My dad (Mr Cohn) doesn’t know anything about agile! 😊
I think you’ve nailed it. The less experienced person will just have a lower velocity than the experienced person. (I don’t like to think about “individual velocity” but it is sometimes helpful in making a point.) I’ve never programmed in Scala (although I want to learn). Theoretically, though, an experienced Scala person and I could agree that “program A” will take half the time of “program B” to code. The experienced person might do those in 1 day and 2 days. It might be 2 weeks and 4 weeks for me. But we’d be able to agree on the relative size and therefore on a point scale for those. When I get good, it’s not that the size of the work changed, it’s that I got faster (my velocity went up).
I wouldn’t want to think of it as “wrong person for the job” if it’s just a learning curve issue. If we did that, the person would never become right for the job.

Jeff said…

Hi Mike,
Thanks for your reply. Two followups:
First, if we continue the example with the inexperienced team member joining the team for a sprint, am I correct in assuming that the correct way to account for their learning curve is to decrease the targeted velocity for that sprint? That is, we will produce less effort worth of output, but we’re consistent in what we call X effort worth of output. Clients/Stakeholders see that we’re slower, but this is preferable to them seeing less business value for what we might otherwise call the ‘same effort’.
And second, if we don’t consider training a high source of risk or uncertainty, what would you consider as sources of risk or uncertainty in a story estimation then?
The only examples I can really come up with are mostly related to things that are outside of the team’s control, such as: Does the platform we’re using even support this, or will we have to build it ourselves? We’re not sure yet, so we’ll estimate as if we have to build it ourselves.
Or perhaps something more basic: I’m unsure of how much effort this will really take, so I’ll estimate on the high side.
Thanks for your time,
Jeff

mikewcohn said…

Hi Jeff—
Yes, I would expect a lower velocity while that novice team member is learning.
I don’t recall saying that training is not a source of risk or uncertainty but perhaps I did. I would say, though, that an untrained team member is a source of risk and estimates that person gives will be less certain.
Other examples of uncertainty could be:
I think this will take 5 points but I’m assuming we actually did that refactoring last year but no one seems to remember for sure if we did.
This will take 8 points if the design approach I just described will work but I’m just not sure whether we have those hooks into the other system we wrote. (Completely under our control but uncertain about whether something exists.) If we don’t have those hooks and have to build them now, this is more like 20 points.

Bob Lieberman said…

This post sure got a lot of attention! I focused on this phrase of yours, Mike: “we assume, in general, the right person for the job will do the work”. I’m interested in your thoughts about how multi-team companies accommodate this guidance, and what effect that has on the value of velocity as a continuous improvement feedback indicator.
In my company, the product is extremely complex. Domain knowledge (that is, knowledge of the way one programs within the framework of our product) is at a premium. And the star performers are 10x or more as good as the average—that is fairly typical in the industry. So the company tries to move the stars to the teams that need them at the moment, where they may stay for just a few sprints and then move on to the next needy team.
This practice disrupts team formation and also causes a team’s velocity to rock n roll every time a staffing switch is made. I find myself using velocity to keep people focused on planning and commitment, but as a continuous improvement measure there are always good reasons why it can’t be relied upon. As a result I have to rely more on the instincts of myself and my team for improvement opportunities. (And we do find them!)
We do have a few highly stable teams here, teams that have been intact for a year or two. They have a totally different experience than the teams with the revolving door.

mikewcohn said…

Hi Bob—
I don’t suggest using velocity as a measure of team improvement as it’s too easily gamed. Doing so really ruins its value as a metric for other purposes. It sounds like you agree with that.
As for how companies handle the need to move people: Most will go ahead and do so as needed (and as it sounds like you do). The best companies, though, will temper that by knowing that moving a superstar from Team A to B for one sprint and then back to A may not really have the desired effect in such a short term. Teams do best when they get to know each other and moving a person on short-term can be disruptive. Personally I’d strongly avoid it unless it was for a minimum of a month (e.g., two 2-week sprints). I guess I’d do it shorter-term in a crisis but I don’t think it’s generally a good idea to move people that often.
If I ever were tempted, I’d want to look instead at my team structure. There’s probably a better way to organize such that I’m assigning a whole team to the big sudden challenge. That is: I’m more comfortable shifting the focus of an entire team (perhaps stacked with more than its share of superstars) than I am moving individuals between teams.

Bob Lieberman said…

This is good insight, Mike, thanks. We have been able to keep the reassignments in-place for a couple of months so at least we’re reaping some team benefit.

mikewcohn said…

You’re welcome. And that’s wonderful to have kept them together for a few months. Often the first month or two is the hardest as you’re breaking an organizational habit (of redirecting people). Having done it now so long, though, you can start to use the experience to argue in favor of continuing to keep teams intact whenever possible.

Raphael Amorim said…

I think I’m not understanding you two! If you have a table like this:
Story Points x Approximate Engineering Time
1 x < 1 day
2 x 1 day
3 x Couple of days
5 x Few days
8 x 1 week
13 x Couple of weeks
21 x Few weeks
34 x Several weeks
55 x Couple of months
89 x Few months
144 x 1 yahren
233 x 1 blue moon
You’re not using scrum, let’s call it something else. If story points are just time-based you don’t need to do this Top-down estimating, go straight and estimate the task in hours, days, yahrens or whatever. You can easily generate the “story points” from the estimates that you have for the tasks by adding them up and see the equivalent “story point”. If story point is a function of effort, risk and uncertainty (It’s more than that: I call that Complexity and that got better over time because risk and uncertainty go down as the team evolves) the last 2 are not time-related, so Story points have some influence pushed from time, but not 100%, not even close to 100%.
If you can’t quickly say whether your team can handle that amount of “complexity” during the next sprint, based on just what you’ve learnt during past sprints and without going down to the tasks, then you have no reason to use story points.

mikewcohn said…

Hi Raphael—
Why would I ever have a non-linear scale? If 1 something = 1 day, then 8 somethings must equal 8 days, not one week. So, I would never, ever have a table like yours.
As for all the reasons why complexity is a factor only to the extent that it influences effort, I don’t think I have any more to add than has already been covered in the 100+ comments on this post.
I’ve blogged elsewhere on the site about the reasons for estimating the product backlog (in whatever unit you choose). There are two reasons to estimate the product backlog items: (1) so that the product owner can use the information to prioritize (if item A is cheap, the PO wants it now; if it’s expensive, the PO wants it later); and (2) so that long-term predictions can be made (how much can we finish in three months). There would be no reason to ever “generate story points from the estimates you have for the tasks.” That would be too late to achieve either of the benefits of points.

Seema said…

Hi Mike, please help me by defining effort, if it is neither complexity nor time nor clarification. What does it stand for in Scrum?

mikewcohn said…

Hi Seema—
I think the PMBOK definition of effort is the standard one (and works well): Effort is the number of labor units required to complete a schedule activity or work breakdown structure component.
So to simplify: Effort is the amount of person-days it takes to do something. In contrast, duration is the number of calendar days something takes. So: Packing the contents of my house might take 5 person days (effort = 5) but it can be done in a day (with five people).
All this is true with points as well but the definitions are more clear I’ve found with good ol’ time examples.
And: To be clear: I’ve never said story points are not time. They are time (more precisely, effort) and effort is influenced by complexity, risk, and uncertainty.
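The effort versus duration distinction above can be sketched in a few lines of Python. This is only an illustration: the function name `duration_days` is hypothetical, and it assumes the work splits evenly across people, which real work rarely does.

```python
def duration_days(effort_person_days: float, people: int) -> float:
    """Calendar days needed, ASSUMING the work divides evenly among people."""
    if people <= 0:
        raise ValueError("people must be positive")
    return effort_person_days / people


# Packing the house: the effort is 5 person-days however it is staffed.
effort = 5
print(duration_days(effort, people=5))  # 1.0 calendar day with five people
print(duration_days(effort, people=1))  # 5.0 calendar days working alone
```

The effort stays at 5 in both calls; only the duration changes with staffing.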

Seema said…

Thank you Mike for explaining it in such a simple way. Now I am very clear on what effort means.

mikewcohn said…

You’re welcome.

Erin Hansen said…

‘effort’ seems a lot like ‘time’ here.
Suppose the Brain Surgeon leaves the team. But he’s still present at the planning meeting.
1000 stamps is still 5 points.
The Surgeon says: Since the brain surgery, which we’ve done many times, was always 5, it should still be 5 points. Even though I am leaving, the task has not changed.
Just to get it right… the task hasn’t changed but the team has, right? So all of a sudden, brain surgery becomes 13 points (or even more).
Right?

mikewcohn said…

Hi Erin—
First, we need to be a bit careful with this example because there are two big flaws with the example when pushed too far:
a) I’m describing this really as tasks like on a sprint backlog (i.e., one person can do brain surgery alone!)
b) We’re using an example of something super duper highly specialized
But, what would happen here is that the new brain surgery is the same size as old brain surgeries so it gets the same number of points—but what changes is that the team’s velocity would go way down. (Again, ridiculously because of the two issues above with the example.)

Cherry said…

Hi Mike,
I really liked your blog, but in one of our Scrum trainings we were told that a basic principle is that a story point is always a relative measure of complexity and should not only consider effort. But I am confused by your statement that it is effort, which is influenced by complexity. Does this mean that when a team is doing point estimation, they should assign story points based on effort, not complexity? Also, in one of your comments you mentioned we cannot call it as time based estimation since it will be different for each individual. In that case, what is the definition of effort? Ultimately we account for effort as time spent per person per day. Sorry if you have replied to a similar question earlier; there is a huge list of comments and I might have missed it.
Regards
Cherry

mikewcohn said…

Hi Cherry—
I’m glad you like my blog; thanks for letting me know.
Whoever did your Scrum training and said that a “basic principle” is that points are a relative measure of complexity and should not consider effort was absolutely 100% wrong on two fronts:
a) There is no “basic principle” about this. Just look at the quantity of comments and debate.
b) Saying story points are complexity is wrong. No one cares about complexity except to the extent it affects effort (a point I’ve made in the post and just about every reply to a comment).
So, yes, this means a team should estimate based on how long something will take.
I couldn’t find a comment where I said we “cannot call it as time based estimation since it will be different for each individual”. (I know you weren’t directly quoting me. I searched for “cannot” and couldn’t find anything that seemed to match what you meant, though.)
Let me try it one more time:
You and I look at a building a certain distance away. You think it’s a 5-minute walk. I think it’s a 10-minute walk because I’m on crutches at the time we have this discussion. You and I cannot agree. You are right that it is 5 minutes (for you) and I am right that it is 10 minutes (for me). That is why we cannot just say screw it and revert to just some simple estimate in time. In minutes, hours, days, etc. the problem is intractable: you and I cannot agree on an estimate if we produce at different rates.
However, you and I can agree that building is “1 unit of time away”. You’re thinking 5’ and I’m thinking 10’ when we agree to that. You then point to another building and say, “That building is twice as far. It’s a two.” You are thinking it’s a 10’ walk and I’m thinking it’ll take 20’ on my crutches. We can agree. We can agree even though we produce at different rates.
Now let’s talk about complexity: walking to either building is equally complex. If we estimated just complexity we’d call each building a 1 (or whatever but they’d both get the same value). What in the world would the point of that be? That’s what I’d want to ask your Scrum trainer about the basic principle.
Let’s continue with complexity. Now suppose we point to a third building. It’s physically the same distance as the first building (which was a 1) so we might want to call it a 1 as well. Except to get to the third building we have to walk along a very narrow path over a nasty plunge into hot lava below us (say a foot wide, or whatever you think of as narrow but traversable). Walking to that building is *more complex*, so even though it is physically the same distance as the first, it will require concentration and balance to walk there. We will each walk more slowly to that building; we might call it a 3 or a 4 if we think it will take 3x or 4x as long as the first building.
If we were only estimating complexity, I have no idea what number I’d put on that narrow walk. I really don’t—how does one estimate complexity? The only way I know is as it affects something else—and in the case of story points, we estimate the effort (time) to do a thing and that effort can be affected by risk, uncertainty, or complexity.
I hope that helps. My big regret in life is that I didn’t shut the door harder on this fallacy in my Agile Estimating and Planning book. It is a huge mistake to think of story points as being entirely based on complexity.
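The three-building example above can be sketched in Python to show why two walkers who move at different speeds still arrive at the same points, and how complexity enters only through its effect on time. Everything here is illustrative: the `Building` class, the `slowdown` factor, and the choice of a 3x slowdown for the narrow path are assumptions, not part of the original example.

```python
from dataclasses import dataclass


@dataclass
class Building:
    distance: float        # arbitrary units; the first building is 1
    slowdown: float = 1.0  # ASSUMED factor: how much slower the path makes you


def walk_minutes(b: Building, minutes_per_unit: float) -> float:
    """Time for one walker, given that walker's own pace."""
    return b.distance * b.slowdown * minutes_per_unit


def points(b: Building, reference: Building, minutes_per_unit: float) -> float:
    """Relative size: this walk's time divided by the reference walk's time."""
    return walk_minutes(b, minutes_per_unit) / walk_minutes(reference, minutes_per_unit)


first = Building(1.0)
second = Building(2.0)               # twice as far
third = Building(1.0, slowdown=3.0)  # same distance, narrow path over lava

for pace in (5.0, 10.0):  # minutes per unit: able-bodied walker, walker on crutches
    print([points(b, first, pace) for b in (first, second, third)])
# Both walkers print [1.0, 2.0, 3.0]
```

The walker's own pace cancels out of the ratio, which is why the two can agree on points while disagreeing on minutes.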

Cherry said…

Thanks, Mike. Next time during story point estimation I will make sure we consider effort.

Tina hank said…

Hi Mike, if story points are all about estimating effort, then why do we use story points like 1, 3, 5? Instead, we should give hours for each story, because ultimately effort will be accounted for in hours or man-days.

mikewcohn said…

You’re welcome.

mikewcohn said…

Hi Tina—
I believe this is addressed in numerous other comments here. The primary reason to use story points at all is because doing so allows people with different skills to discuss estimates.
Imagine two team members we’ll name Fast and Slow. Fast can do something in let’s say 1 hour. Slow will take 2 hours to do the same thing. There is no way they can agree on how long it will take. Each is right about his or her own estimate.
However, Fast and Slow can agree to call this thing “1 point”. They can then subsequently agree to call something else that would take twice as long “2 Points”.
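A minimal sketch of the Fast and Slow example follows, assuming each person privately estimates in their own hours and then expresses every item relative to a shared reference item. The function name `relative_points` and the item labels "A" and "B" are hypothetical.

```python
def relative_points(hours_by_item: dict[str, float], reference: str) -> dict[str, float]:
    """Convert one person's private hour estimates into sizes relative to a reference item."""
    ref_hours = hours_by_item[reference]
    return {item: hours / ref_hours for item, hours in hours_by_item.items()}


# Private estimates in hours; item B is the "something else that would take twice as long".
fast = {"A": 1.0, "B": 2.0}
slow = {"A": 2.0, "B": 4.0}

print(relative_points(fast, "A"))  # {'A': 1.0, 'B': 2.0}
print(relative_points(slow, "A"))  # {'A': 1.0, 'B': 2.0}
```

Fast and Slow disagree on every hour figure, yet their relative sizes are identical, which is what lets them settle on "1 point" and "2 points".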

Anon Non said…

Exact effort isn’t always known when estimating. Sometimes hidden complexity comes out only after sizing and even assignment. Would you increase the points based on that?
PS - I love your blog too!

mikewcohn said…

Hi Anon—
If exact effort were known, we wouldn’t have to call it “estimating”! 😉
Re-estimating is tough. If you re-estimate too often, you really hurt predictability by mixing things that were estimated without real knowledge of the work with things that were estimated once the work began. That sounds ok but consider that the entire product backlog is all unstarted. So, re-estimating almost always results in overstating velocity. That can feel good until someone divides by velocity to predict a completion date.
I’ve got a full blog post on re-estimating here: http://www.mountaingoatsoftwar…
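The effect of re-estimating on a forecast can be illustrated with made-up numbers. In this sketch the backlog size and both velocity figures are hypothetical; the only point is what happens when a velocity inflated by re-estimated items is divided into a backlog that still carries its original estimates.

```python
import math


def sprints_remaining(backlog_points: float, velocity: float) -> int:
    """Whole sprints needed to finish the backlog at a given velocity."""
    return math.ceil(backlog_points / velocity)


backlog = 200                # HYPOTHETICAL: unstarted items, all estimated up front
original_per_sprint = 20     # HYPOTHETICAL: up-front estimates of the items finished per sprint
reestimated_per_sprint = 30  # HYPOTHETICAL: the same items after being re-estimated upward

print(sprints_remaining(backlog, reestimated_per_sprint))  # 7, looks good on paper
print(sprints_remaining(backlog, original_per_sprint))     # 10, consistent with how the backlog was estimated
```

The inflated velocity and the un-inflated backlog are in different units, so the first forecast is optimistic by three sprints in this made-up case.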

Hind said…

Hi Mike,
Thank you for this interesting article.
I have 10 years of experience in project management using traditional management methods.
The project I’m currently managing is an agile project that takes 4 times more than the effort planned at the beginning of the project.
I am preparing my Ph.D. on the subject of estimating the effort of agile projects.
I would like to know if you have any ideas that can help me in my research.
Thank you in advance,

mikewcohn said…

Hi Hind—
I’m glad you found this article interesting. I don’t have any specific ideas for your Ph.D. research. I suspect you can find some interesting topics on this blog or in my book. Testing of any of the ideas here would be wonderful—while I collect data on some things, much of what I “know” is anecdotal and proof (or disproof) would be great.
My favorite researcher is Magne Jorgensen. He, along with a few colleagues, is doing the most interesting research on estimating (agile or otherwise). I suggest looking at some of the things he’s published. I always find them a great source of ideas. His page is at https://www.simula.no/people/m…
Good luck on your Ph.D.

Hind said…

Hello,
Thank you for your feedback.
I’m trying to develop a model for estimating velocity: which factors affect velocity and project estimation, and how.
I’m trying to focus on velocity. I have experience working with and implementing CMMI, and I’m trying to figure out how to implement risk management based on velocity.

Stijn Hoeke said…

Hi again,
So if I’m correct, you state that ‘1 unit of time’ or ‘one story point’ represents an amount of time which could be different from person to person, like in the example where it’s 5 minutes for person one and 10 minutes for the person with crutches. I see that this makes sense when you have a (cross-functional) team who doesn’t want to have estimations based on just one person’s speed or skill.
In Henrik Kniberg’s book “Scrum and XP from the Trenches”, he says “our unit of estimation is story points which, in our case, corresponds roughly to ‘ideal man-days’,” which he uses to estimate the velocity of the next sprint. Of course, if it works for him, he should keep doing it, but do you think he is doing it ‘wrong’, since the amount of time for one story point is the same for everyone?
And suppose you estimate the velocity of the same upcoming sprint twice, in the same way he does in the book (p. 26), but in one case you use his story points (which represent ideal man-days) and in the other you use your type of story points (which could mean ‘anything’). Won’t the results be different, since the completed points in the prior sprints will most likely be different?
Looking forward to hearing your thoughts on this.

mikewcohn said…

Hi Stijn—
It isn’t just that a team doesn’t “want to have estimations based on just one person’s speed or skill.” It’s that that doesn’t work [well]. I have a hard enough time estimating how long something is going to take *me* to do. How can I estimate how long it would take that one person?
So, yes, based on that quote from Kniberg’s book, he was not doing what I’d recommend. The primary benefit of points is that they allow individuals of different skills to communicate, as described at http://www.mountaingoatsoftwar…  If you’re going to estimate in ideal days, by all means call them “ideal days” and avoid the confusion.
Note, however, that Kniberg wrote that book when he was very new to agile and Scrum. He’s absolutely one of the best thinkers in the entire agile world and is someone whose every comment I take seriously, but he was new to agile back then and his opinion may be quite different these days. I’ll email him and see if he cares to comment on this here.
I can’t comment on page 26 of his book as I don’t have it with me and usually make it a point not to comment on the work of others here. (If I have comments on that, I usually leave it on their own blogs.) But, in general, if velocity is calculated based on points that are derived two different ways then velocity should be different. Note, though, that in “Agile Estimating and Planning,” I called velocity “the great equalizer” because this should normalize over time (assuming a consistent approach is used). In my experience, however, defining points = ideal days will result in greater variability (standard deviation) of velocity depending on the influence of each person on each estimate. (Since each person is really estimating their own ideal time, the estimate given a product backlog item will be less consistent as some people are more influential on some stories than they are on others.)

Henrik Kniberg said…

Spot on, Mike. Indeed, the book is 8 years old and it was our beginner’s journey into agile estimating and planning (ironic, considering how many ppl use it as reference material even today). A lot of the patterns in the book are still valid today, but the part about ideal time, focus factor, and all that - well that’s one part I’d love to just rip out of the book 😊
At the story level, I almost never bother with time estimates, because it’s going to be wrong and cause confusion.  Use story points as a pure unit of relative size or effort. Then use velocity per sprint to aggregate and connect it to time.

mikewcohn said…

Thanks for chiming in, Henrik. We all appreciate it.

Stijn Hoeke said…

Thanks, Mike. I just wanted to hear your thoughts on this to be sure I didn’t miss anything. And thanks, Henrik, for replying as well.

mikewcohn said…

You’re welcome, Stijn.

Marius George said…

Hi Mike,
You mention so many times in the article and comments that story points somehow should involve thinking “how long”. And this is simply not correct. In your example above, all that the two guys need to get good at is estimating how far the buildings are away from them. After a few goes, they will get quite good at working out that this building is one unit away, and that one is two units away. A really far away building might be a bit harder to work out, so they might give it a 13. We are measuring size… in this case distance. Both people estimate more or less the same distance, even though the guy with the crutches will take much longer to get there. After walking to several buildings over a few weeks, they will learn that as a team, they cover N units of distance (the size measurement) over a fixed time period. During this time, the crutches guy will have covered a lot less distance than the guy without crutches. The team now has a velocity: as a team, they cover 5 distance units per sprint. When they plan the next sprint, they can plan for 5 distance units in that time box. The 5, i.e. the story points, is ONLY size, not time. Crutches guy would have covered 1 distance unit, and other guy 4.

mikewcohn said…

Marius—
And on a software project what do you suggest two individuals use as that proxy for size? Lines of Code?

Marius George said…

I talked about this today in the office, and I’m not sure I have a clear answer yet. I used a few different analogies to distinguish between size vs. time based estimation. One was shifting piles of gravel. If we have a big guy and a little guy, and we ask each one how long it would take them to shift a pile of rubble of a certain size, they would clearly give very different answers. However, if we ask them to rate the pile of rubble in size, they would hopefully give the same answer, and then get on with shifting the pile… with the huge guy completing the majority of the work. The other example I used was “walking to buildings”, where we shift from asking… “how long would this take you”, to “how far is this building”. This is all simple to understand, but it seems that it starts to break down a bit when we try to apply the same to software. Maybe thinking simultaneously of lines of code, complexity of the problem, and so on, would work alright. If I visualize a certain feature as one or two small functions in some class vs. writing a few classes with several methods and some integration with other classes, then it should be possible to think of these things without bringing time into it… but it’s very hard not to instantly think in time units. My single biggest issue with thinking in time units is that time is time… and you only have N man-days in one sprint, and thus the time units will just fill up to some constant value and the velocity cannot increase on a nice graph over several months, for example if the team gets faster and faster at doing the same thing. I don’t have complete clarity on this yet, but I need the maths to make sense so that velocity does not tie in to time units directly.

Marius George said…

Allow me to elaborate on the pitfalls of time-derived Story Point estimations.
Scenario…
Let one sprint be 5 days and let’s imagine that each member in our
team works a full productive 8 hours per day, giving us 80 man-hours
in a sprint.
Big Guy (BG) and Small Guy (SG) must shift piles of rubble from A to B.
There are various piles of various sizes. Let’s look at the first
two piles: Pile X is 3 times as large as Pile Y.
BG and SG cannot agree on how long it would take to shift Pile X,
because BG thinks it is 3 hours, and SG thinks it is 6 hours.
So instead, they agree that Pile X is 3 Story Points.
They look at Pile Y, and BG thinks that will take him 1 hour.
SG thinks Pile Y will take him 2 hours.
Relatively speaking, they both agree that Pile Y is 1 Story Point.
————————————————
Side note:
We can now calculate that BG works at a rate of 1 Story Point per hour.
SG works at a rate of 0.5 Story Points per hour. We know this, because
they each have a private mapping of hours to Story Points.
————————————————
They get on with shifting piles of rubble. At the end of the sprint,
they look back and, given that their (private) time estimates and thus
public Story Point estimates were spot on, they shifted 60 Story Points
worth of rubble. BG shifted 40 and SG shifted 20. Their “velocity” is 60.
All the maths check out so far.
When SG looked at Pile X, he thought it would take him twice as long as it would take BG.
We can see this in the 40 vs. 20 Story Points completed by each.
We can see this also in the rate of work of 1SP/h (BG) compared to 0.5SP/h (SG).
So what’s the problem?
The problem is that velocity is pretty much derived from time. It will always be 60.
No matter how fast BG works next month compared to this month, he will ALWAYS map 1
hour to 1 Story Point, so instead of velocity going up as he gets faster and faster
at shifting rubble… he will simply estimate similar piles of rubble as less and
less Story Points over time.
Let’s continue with the scenario to demonstrate this:
BG and SG have shifted rubble for a whole week. They have completed 60 Story Points.
Over the weekend, BG goes on a Rubble Shifting Techniques course, and learns that by
using a particular spade, holding it in a certain way, and taking frequent breaks, he
can now shift rubble at twice the speed as last week.
Monday morning…
BG and SG look at some new piles of rubble. They both look at Pile Z. They can see that
Pile Z is the same size as last week’s Pile X. SG says Pile Z is 3 Story Points.
BG however, now looks at Pile Z and thinks…
“Hey, this will take me 1.5 hours. Using my same formula from last week, this is more like
a 1 Story Point or 2 Story Point rubble pile now”.
And so it all goes wrong, because BG and SG are NOT simply judging the Size of the piles to
agree on Story Points… instead they are thinking in Time terms to arrive at Story Points.
They now no longer can agree that Pile Z is 3 Story Points, even though it’s exactly the same
size as last week’s Pile X.
What BG should do in week 2, is judge Pile Z to be the same as Pile X from last week. In other
words, 3 Story Points. He would now complete Pile Z in Less Time, and thus the velocity of the
team would go up as you expect… given that BG went on the fancy course to speed up his rate
of work.
But why would BG give Pile Z a 3, if we trained BG to think about time and to map the time to
a Story Point value?
This is the danger of bringing time into the equation. Piles of the SAME SIZE would be judged
at different Story Points over time, and Velocity becomes a useless constant value.
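The arithmetic in this scenario can be checked with a short Python sketch. It assumes each person works the full 40 hours of the sprint and that BG really does double his pace after the course; the function and variable names are hypothetical.

```python
HOURS_PER_PERSON = 5 * 8  # 5-day sprint, 8 productive hours per day


def velocity(points_per_hour: dict[str, float]) -> float:
    """Team velocity: each person's points per hour times hours worked in the sprint."""
    return sum(rate * HOURS_PER_PERSON for rate in points_per_hour.values())


# Week 1: BG completes 1 point per hour, SG 0.5 points per hour.
week1 = velocity({"BG": 1.0, "SG": 0.5})

# Week 2, size-based points: Pile Z stays at 3 points and BG clears it in 1.5 hours,
# so his rate becomes 2 points per hour.
week2_size_based = velocity({"BG": 2.0, "SG": 0.5})

# Week 2, time-derived points: BG keeps mapping 1 hour to 1 point,
# so his measured rate stays at 1 point per hour despite working faster.
week2_time_derived = velocity({"BG": 1.0, "SG": 0.5})

print(week1, week2_size_based, week2_time_derived)  # 60.0 100.0 60.0
```

With size-based points the improvement shows up as a higher velocity; with time-derived points the measured velocity stays pinned at 60.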

mikewcohn said…

Hi Marius—
Thanks for the detailed example. I’ve used many of the same examples, including the rocks example in a talk on this that I gave at Google back in 2007. It’s on YouTube at http://youtu.be/fb9Rzyi8b90
The problem with your example is when you say BG “will always map 1 hour to 1 point.”
Yes, that’s a problem and that is NOT how story points should be assigned. When BG goes to estimate something a sprint after the first example, he should look at the pile of rocks and think, “Hmm, that’s the same size as Pile X. And, last time I called Pile X three points so it’s three points this time.”
BG will do that whether (a) he is now stronger from a full sprint of moving rocks; (b) has a bad back from a sprint of moving rocks; or (c) is exactly the same as a sprint before.
The key with points is that they are to be estimated relatively each time. Presumably BG gets better at moving the rocks each sprint (or he would in the underlying software analogy, presumably) and so what changes in cases (a) and (b) would be his velocity. (Not that we really want to measure individual velocities.)

Marius George said…

I have thought about this a bit more… so here is a question: If I ask you “How long will this take”, and you say “4 hours”, why did you say “4 hours”? The answer is, you arrived at 4 hours by measuring a bunch of other things. 4 hours is not the “proxy for size” in your question above… it is a proxy of a proxy for size. So really more thought is needed about why someone arrives at the answer “4 hours”. If you investigate where the 4 hours came from, you will know what has truly been measured (the size-based estimation at the core of it). So taking this to the piles of gravel example… if Small Guy said it would take him 1 day… he probably said that because he is looking at the size of the pile, and he knows he is a small guy. If you tell me that implementing this feature would take you 4 hours, you probably said that because you know you are familiar with this kind of feature, you also know that the feature will involve writing one or two small-ish functions with a bit of configuration attached to it, and so on. It is this underlying set of measurements that should map to the Story Points estimation, and this same underlying set of measurements arrives at a time estimate FOR YOU. It should not be the other way around where the time estimate is ever mapped to Story Points directly, EVEN if it’s the first time that you move a pile of rubble.

mikewcohn said…

Marius—
I coach individuals on teams to “think in hours but speak in points.” So, someone I’d coached wouldn’t say, “four hours.” They would say, “it’s like that other one so it gets whatever points we put on it.”

Badru said…

Great post.

David Geissler said…

Mike: I too like the examples but I also like to differentiate between complexity and effort so that we can adapt from the experience.
In one example, you are on crutches meaning that the level of effort takes longer. In a perfect world scenario your effort would have been minimal (1-2pts). However, since you are partnering with the other person the effort is shared. Presumably, you want to arrive together. Her effort increases as she supports you which subsequently inflates her estimate (perhaps 3pts). When you perform this again without crutches both your effort will be reduced back to a 1-2 pts.
Let’s make the parallel with development. Her engineering knowledge is a higher level than yours and she is able to complete this story quicker than you. However, you’re new to the team and you need to gain understanding of how to program this type of story. The effort for you will be substantially higher if you were to take this on yourself. However, if you partner then the effort for you will be supported by the effort from the partner subsequently lowering your estimation but increasing hers. If she did this herself, the effort would be small but the benefits of pairing to increase your efficiency would be lost.
To you, in this case, the effort and the complexity are both high.
To her, in this case, the effort and complexity are both low.
To both of you, there’s a middle ground.
There are three points I use in measuring complexity:
1. How hard is it to describe? - Basic user story. The more vague the increased complexity
2. How hard is it to create? - Code complexity - Knowledge of team. Less familiar more complex
3. What is the degree of organization? - Visualizing the product (i.e. generate schema) - Does a visual artifact reduce complexity?
Using this measurement, I can task out the effort (using time) and determine where specifically the complexity lies (using points). I.e. the third building is farther away and more circuitous and Mike has to use crutches to complete his task.
Task 1 - first building - 10 mins - Blocker = Crutches
Task 2 - second building - 20 mins - Blocker = Crutches
Task 3 - third building - 40 mins - Blocker = Crutches and circuitous path
In this case, the crutches are clearly a blocker (aka limited understanding of code) to delivering efficiently. The circuitous path may appear to be a blocker but it is something that cannot be changed. Analyzing the blocker helps Mike and his partner reduce complexity and in this case level of effort for future story development.
The next time they walk to any of the buildings, the level of effort and complexity for buildings 1 and 2 will be relatively low, while the measure of complexity increases for building 3 even as the level of effort drops due to familiarity with the task.
Hope that wasn’t too long winded.
Cheers,
-d

Mike Cohn said…

Hi David—
I think I follow your example. One issue, though, with going too far with this example of walking between buildings is that it is really a task and done by one person, unlike a user story, which would typically be done by perhaps 3-4 people on a project, as in your example of two people helping one another. I want to be really careful to avoid ever estimating complexity for its own sake.

lm said…

but effort will vary by developer, while complexity won’t. so complexity is a more robust metric. could you please provide a more realistic example in addition to the extreme illustrative example? if two developers have different strengths and knowledge do you always assume a story will be picked up by a particular developer at the point of planning? if not, then you can’t know the effort required at the outset for the reasons you illustrated.

Mike Cohn said…

Not true. If effort is estimated relatively it will not vary.
And complexity is a useless measure. I have never once been asked by a boss, client or customer, “I don’t care how long this will take, but how hard will you need to think? How complex is this?”  Complexity is only a factor to the extent it affects how long something will take.

lm said…

effort won’t vary by developer? but your dr/child example seems to argue the opposite. i.e. that the effort estimate only makes sense if you “choose the right person for the job”. which means it would otherwise vary by developer. also, “x points of complexity” is no more or less meaningless than “x points of effort” to a boss, client or customer without explanation.
in my experience (10 years in scrum-ish agile) non-technical stakeholders are rarely exposed to the concept of story points and when they are they have difficulty relating the concept to time estimates. but we still find points very useful for our own internal estimating.  i don’t think points should really be intended for clients and customers.

Mike Cohn said…

Relative effort will not vary. And, yes, I said to assume the right person for the job. I’ve also said these examples degrade when taken too far because we are using *tasks* as examples and story points should be put on stories. (Stories being done by multiple people, tasks being done by one.)

lm said…

relative effort would vary if the two developers have different, complementary strengths. if i am fluent in python but not java and my teammate is fluent in java but not python then the relative effort for two stories, one entirely requiring python and one entirely requiring java, would be reversed depending on developer. if you’re talking about effort given a specific developer, then i agree with you.
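
A minimal sketch of the Python/Java case described above. The hour figures and all names are hypothetical. It computes the effort of the Java story relative to the Python story for each developer separately, and then under the "right person for the job" assumption from the earlier replies.

```python
# Hypothetical hours each developer would need for each story (assumed numbers).
HOURS = {
    "python_dev": {"python_story": 4, "java_story": 12},
    "java_dev": {"python_story": 12, "java_story": 4},
}


def relative_effort(dev):
    """Effort of the Java story relative to the Python story for one developer."""
    hours = HOURS[dev]
    return hours["java_story"] / hours["python_story"]


def relative_effort_right_person():
    """Same ratio, assuming each story goes to whoever is fastest at it."""
    best_java = min(h["java_story"] for h in HOURS.values())
    best_python = min(h["python_story"] for h in HOURS.values())
    return best_java / best_python


for dev in HOURS:
    print(dev, relative_effort(dev))
print("right person for each story:", relative_effort_right_person())
```

Per developer the ratio flips (3.0 versus roughly 0.33), which is the objection raised here; with each story assigned to the developer best suited to it, the two stories come out equal in this made-up data.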

John G said…

Correction to the last point, for other confused readers: I think you mean to say the biggest issue with function points is that they are NOT very easily derived in advance.  So true.

Mike Cohn said…

Oops! Thanks, John. You’re right.

Daryl Antony said…

I suppose Effort is correlated as Time?  I wonder then, instead of something that is Hard, can we think of Complexity as a measure of Work over Time?
Where it gets more interesting, is that if we decouple Complexity from any form of Time, however, using tasks that we’ve done prior as a means to estimate the Complexity of tasks that we will do; i.e. User Can Login, might be 3 pts — which might be similar in Complexity to the task we have at hand — we’re then able to correlate some time data points about “how long” it took to achieve a task of Complexity 3 pts.
From here, we have a framework to let the stakeholders know about the cost / effort estimations they care about; whereas the development team can still collaboratively remain in a construct that makes more sense to estimate — “is this task more or less complex than the prior calibration task that we determined to be a 3?”

Mike Cohn said…

The danger with anything that is too disconnected from time is that we could be estimating something no one cares about. Bosses, clients and customers want to know how long something will take. So, ultimately whatever we estimate in needs to be converted into a duration. Estimating purely complexity (for reasons I’ve listed here and in other posts on this subject) cannot do that as complexity is only one factor in how long something will take.

Daryl Antony said…

This is completely contradicting.
Complexity is a measure of effort, work & risk.  Relative sizing.
It shouldn’t be correlated to a linear time-scale.
If complexity is high, risk is high, and therefore time estimation is prone to be highly inaccurate.
If complexity is low, risk is low.  Comparatively, time estimation is probably going to be more accurate.
Achieving Work / Complexity over Time is measured by Velocity.
The development team, over the course of 1 or more sprints will establish a Velocity, i.e. the amount of Complexity Points they can achieve, over time.
This is your very chance to communicate to Bosses, clients, customers the “when” and “how long” something will take.
And if you can calibrate amongst the team, a consistent idea of Complexity; and that team works at a consistent pace, sprint by sprint — your ability to use the historical data collected; as a means to predict the future, is increased.
This is your opportunity; if you _do not_ confuse Complexity points with Time / Effort based estimation.
Otherwise you’re simply creating an abstraction of Time / Effort — and bastardising it as Complexity / Story pts.
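
A minimal sketch of the velocity-based forecast described above. The past velocities, the backlog size and the function name are assumptions made for illustration; it simply divides the remaining points by the average velocity and rounds up to whole sprints.

```python
import math
from statistics import mean


def sprints_remaining(backlog_points, past_velocities):
    """Estimate whole sprints needed, using average historical velocity."""
    average_velocity = mean(past_velocities)
    return math.ceil(backlog_points / average_velocity)


# Hypothetical history: points completed in each of the last three sprints.
past_velocities = [28, 32, 30]

print(sprints_remaining(120, past_velocities))
print(sprints_remaining(125, past_velocities))
```

The forecast is only as good as the consistency of the estimates and of the team's pace, which is why it works on team averages rather than on any single story.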

Daryl Antony said…

The task hasn’t changed; you’re right — if the team or the technology changes, the team’s ability to achieve Story Points over Time changes and is measured in Velocity.
Complexity is often measured with given constraints; i.e. “given that the team will use a certain technology; what is the complexity” — Brian the Brain Surgeon _may_ have anchored the task complexity, given that he would use a certain technology or techniques.
Perhaps it’s important here for the team to recognise the presence of such anchors / constraints.

Aneldi Renasaou said…

Hi Mike,
I have read the post and the comments and I just want to check if the following conclusions of mine are correct:
1. We estimate in SPs which are a relative measure of effort
2. Effort is a resultant of complexity AND/OR risk AND/OR amount of work
3. If I have a new team (unknown Velocity) they cannot estimate how much time they will need to complete what they agreed as being a task of 3 SPs.
4. Only after I find out their Velocity, I can say, “ok, if you did 30 SPs / sprint on an average, you can estimate those 3 SPs to 1 day”
However, it sometimes happens that 2 tasks are both estimated at 5 SPs and at the end we realize that the team spent very different amounts of time on each - 4h vs 16h. So what should my approach be in this case?

Mike Cohn said…

Hi Aneldi—
1-3: Yes.
4: Yes, but you want to be very careful with doing that. That will lead to individuals thinking “Hmm, 3 story points equals 1 day” and while that is true *on average* for your team it will not be true for each individual on the team.
As for your ending question: There is no *exact equivalence* such as “3 story points = 1 day.” The relationship between points and hours is exactly that—a relationship, not an equivalence. On average for your team, you’ll find that one point equals an average of n hours plus or minus s hours. (Actually I don’t recommend you really track it enough to know this. But in theory you would know that.) The point is that one point is not always exactly n hours. One point is a range of hours.
For an example of a team that did track this (and avoided introducing the dysfunctionalities that normally come when tracking it) see https://www.mountaingoatsoftware.com/blog/seeing-how-well-a-teams-story-points-align-from-one-to-eight
For what to do when you find one story is way out of whack (e.g., more than perhaps a standard deviation or two away) see any of the posts I’ve written here already on re-estimating, or my book which covers it or my video course which also covers it.
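
A minimal sketch of the "n hours plus or minus s hours" relationship. The two 5-point stories (4h and 16h) come from the question above; the other stories, the function names and the threshold parameter `k` are assumptions made for illustration.

```python
from statistics import mean, stdev

# (story points, actual hours). Only the two 5-point stories come from the
# question above; the remaining entries are hypothetical.
stories = [(5, 4), (5, 16), (3, 6), (2, 4), (8, 16), (3, 7)]


def hours_per_point(stories):
    """Return the mean and sample standard deviation of hours per point."""
    ratios = [hours / points for points, hours in stories]
    return mean(ratios), stdev(ratios)


def out_of_range(stories, k=1.0):
    """Stories whose hours per point lie more than k standard deviations from the mean."""
    n, s = hours_per_point(stories)
    return [
        (points, hours)
        for points, hours in stories
        if abs(hours / points - n) > k * s
    ]


n, s = hours_per_point(stories)
print(f"one point is about {n:.2f} hours, plus or minus {s:.2f}")
print("candidates for a second look:", out_of_range(stories, k=1.0))
```

On this data both 5-point stories fall outside one standard deviation and neither falls outside two, so the choice of `k` decides whether they get flagged at all.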

Aneldi Renasaou said…

Mike, thank you very much for your clarification! This helped me a lot.

Mike Cohn said…

Thanks for letting me know. I’m glad it helped.

Chithra said…

Hi Mike, there always seems to be a little problem when it comes to how developers and testers perceive the effort required. One of the benefits of agile is the frequent incremental integration and some level of regression testing at every stage. But the downside seems to be that test effort is essentially based on functional requirements, whereas for development teams technical architecture, platform, language, etc. drive the effort besides functionality. In some cases unit testing checks the basics and that is all. Of course test teams are involved very early (maybe from the design stage), so there is some common understanding. They participate in poker sessions as well, but is there any way to correlate integration and interface test effort with development effort for a given number of story points in a release (assuming there is already a well-established baseline productivity for development per story point)? Any insight on this would be helpful not only for agile but for any SDLC model.

Mike Cohn said…

I’m not sure I understand the question. The integration is through discussion. Because the common ground is effort, all roles involved are able to share in the discussion.