A common criticism of story points is that the meaning of a story point will differ among teams. In this post I want to describe how can we establish a common definition of a story point across multiple teams within an organization.
The best way I've found to do this is to bring a broad group of individuals representing various teams together and have them estimate a dozen or so product backlog items (ideally in the form of user stories in my opinion). Not each estimator needs to understand every item but most people should understand most items. The items being estimated do not need to be new items; some could be from a project finished recently that many estimators remember or worked on. Some items could be artificial; perhaps the team is asked to estimate, “a typical transaction activity report.” If that meant something to most estimators, it would be a good candidate item. I've done with this 46 people in a large conference room–44 estimators plus me and a coach from my client who wanted to watch so he could moderate such a meeting the next time one would be needed. The 44 estimators represented 22 teams; two estimators per team were in the meeting.
If you've seen or used the Mountain Goat planning poker cards, you'll have noticed that they feature a very large number in the middle (plus the number in a smaller font in the corners). We could have done something cute like put eight little goats on the eight card. We put the very large number there deliberately, though: We wanted it to be visible across a potentially large conference room. You can probably imagine how difficult it might be to gain consensus among 46 people playing planning poker. While it will not take proportionately longer to derive estimates, it does take quite awhile with that many people. I think it took us about two hours to estimate twelve items. But when that meeting was over, each pair of estimators went back to their teams with twelve estimates. Those estimates could then be used as the basis for estimating future work. As each team estimated new product backlog items they would do so by comparing them to the initial 12 plus any estimates that had been produced since (by them or any other team).
I'll blog next about when it may or may not be a good idea to establish such a common baseline.