The AwkEng Estimates Things

by Sam Feller October 04, 2023

Hi all!

Today's post is about an estimation technique that I like. I'm not sure this particular variation has a name, so I'll call it Wideband Delphi with bounded estimates. It uses some "Planning Poker" style wisdom-of-the-crowds techniques to estimate both the "likely" and "pessimistic" durations for an activity. In practice, I find that it's fast, engineers are comfortable with it, and the bounded estimates that it generates lead to good discussions about the nature of schedule risk. (And the estimates, in my limited, anecdotal experience, have been pretty good.) I'll share up front how to put it into practice, then talk about why it works.

Wideband Delphi with Bounded Estimates

Here's how the process works.

1. Start with an item to estimate. (Presumably, you have a list of tasks or deliverables, and you'll repeat this process for each item on the list.) Discuss what the item is and allow people to ask questions.
2. Ask each engineer to independently (and secretly!) provide both their likely and pessimistic estimates for how long the task will take. You can use Fibonacci sequence estimates (1, 2, 3, 5, 8, 13...) but for wideband Delphi with bounded estimates, I prefer days or weeks.
3. Reveal everyone's estimates.
4. If you haven't achieved convergence on the first round, ask the engineers at the extremes of both low and high bound estimates to explain their rationale.
5. Have the engineers independently create new estimates.
6. Reveal the estimates again. You can repeat steps 4 and 5 if needed. It's ok if estimates haven't converged completely; just take the median values.

And that's it! I've found that groups can converge quite quickly, even on difficult questions. For example, I've seen teams estimate things like Linux kernel updates, where they all agree that the likely case is 1-2 weeks, but even if things go very, very wrong, it won't be worse than 8 weeks. And that's just on the first round of estimation.
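
As a rough illustration, one round of the process above could be tallied like this. It's a minimal sketch, not a tool I'm claiming anyone uses: the engineer names, the numbers, and the `summarize_bound` helper are all made up for the example. It takes the median of each bound and notes who sits at the extremes, since those are the people you'd ask to explain their rationale.

```python
from statistics import median

def summarize_bound(values):
    """Summarize one bound (likely or pessimistic) across all estimators.

    `values` maps an estimator's name to their estimate for that bound.
    """
    return {
        "median": median(values.values()),
        # The estimators at the extremes are the ones asked to explain.
        "lowest": min(values, key=values.get),
        "highest": max(values, key=values.get),
    }

# Hypothetical first-round estimates in weeks: (likely, pessimistic).
round_one = {
    "Ana": (1, 6),
    "Ben": (2, 8),
    "Cy": (1.5, 5),
    "Dee": (3, 12),
}

likely = summarize_bound({name: est[0] for name, est in round_one.items()})
pessimistic = summarize_bound({name: est[1] for name, est in round_one.items()})

print("likely:", likely)
print("pessimistic:", pessimistic)
```

If the spread between the lowest and highest estimates still looks wide after the discussion, run another round; otherwise the two medians become the bounded estimate for the item.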

The power of this technique comes from chaining together multiple estimates and developing an understanding of the uncertainty in a project, which in turn leads to discussions of how to drive that uncertainty out. Those conversations are incredibly important for stakeholder management.

Of course, sometimes people like having hard dates, but this can help pick that hard date with an understanding of where you are between the low and high bound estimates. Pushing for an earlier date comes with the risk of outright slipping it, or needing to put in a lot of extra hours to make it happen. But now you can make some of those risk-based decisions up front, with a data-informed discussion, rather than an arbitrary "gut feel".

More on Why This Works

I think one of the reasons this works so well is that Notoriously Squirrely Engineers are much more comfortable providing likely and pessimistic estimates than giving a single hard number. There are strange incentives when putting together single-number estimates, either to pad the date to create wiggle room, or to lowball it to make a particular path look easier and less costly.

The real world is much messier than "how long will this take", and the bounded estimate creates room for that, while also acknowledging the practicality of needing dates to be able to make good decisions and track progress.

Event Durations Are Log-Normal

I think the best model to capture that messiness is called the log-normal distribution. Most people have heard of the bell curve (or the normal distribution). Well, if the logarithm of a quantity follows a bell curve, then the quantity itself follows the log-normal distribution. What's neat is that on the left side, it's bounded at zero, which corresponds well to real world events that can't happen faster than no time at all. In the middle is the "likely" case, and towards the right is the long tail of things that seemingly come out of nowhere and take forever.

Physical things like trip durations, height/weight distributions, and yes, task durations, tend to fit a log-normal curve better than a bell curve.

[Figure: normal vs log-normal]

The idea here is to try and capture the 50th percentile "likely" case and the 90th percentile "pessimistic" case with the two estimate bounds. You don't need to capture it perfectly; you just need to get close enough to have an informed discussion.
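
To make the percentile idea concrete, here's a small sketch of how two bounds pin down a log-normal curve. It assumes the "likely" estimate is exactly the 50th percentile (the median) and the "pessimistic" estimate is exactly the 90th, which is an idealization. The function names are my own for this example, and the inputs borrow the kernel update numbers from earlier (roughly 1.5 weeks likely, 8 weeks pessimistic).

```python
import math
from statistics import NormalDist

def fit_lognormal(likely, pessimistic, pessimistic_percentile=0.90):
    """Return (mu, sigma) of ln(duration) from two percentile estimates.

    Assumes `likely` is the median and `pessimistic` sits at the given
    percentile of a log-normal distribution.
    """
    if not 0 < likely < pessimistic:
        raise ValueError("need 0 < likely < pessimistic")
    # z-score of the pessimistic percentile (about 1.28 for the 90th).
    z = NormalDist().inv_cdf(pessimistic_percentile)
    mu = math.log(likely)  # the median of a log-normal is exp(mu)
    sigma = (math.log(pessimistic) - mu) / z
    return mu, sigma

def duration_at(mu, sigma, p):
    """Duration at percentile p (0 < p < 1) of the fitted curve."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))

mu, sigma = fit_lognormal(likely=1.5, pessimistic=8)  # weeks
mean = math.exp(mu + sigma**2 / 2)  # mean of a log-normal

print(f"median: {duration_at(mu, sigma, 0.50):.2f} weeks")
print(f"90th percentile: {duration_at(mu, sigma, 0.90):.2f} weeks")
print(f"mean: {mean:.2f} weeks")
```

The mean lands above the "likely" value because of the long right tail, which is one way to see why schedules built only from likely estimates tend to run long.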

The Oracle of Delphi, aka Planning Poker

I'll be repeating internet myths if I explain exactly how the Oracle of Delphi technique was first developed, but the idea is that the averaged wisdom of the crowds tends to be more accurate than any one individual. The Planning Poker style of simultaneously revealing estimates allows for quick estimation without letting any one individual bias the crowd ahead of time.

Even when final dates haven't been important, I've found Planning Poker and estimation valuable for driving out misunderstanding when two estimators wildly diverge, i.e. "what do you mean, that's an 8, I thought that was a 2?" - "well, you said it needed feature Z, if we only did X and Y, then yeah, it's a 2" - "oh, I didn't know Z was that hard, let's scrap it, then."

Teaser: Event Sequencing

I won't get too in depth in this blog post, but there are Critical Chain planning techniques that use the low bound estimates to create a schedule, then use the high bound estimates to create a buffer. The schedule tends to always slip, which is expected, because it's built against the "things going well" likely estimates. What's cool is that if you use too much schedule buffer too soon, it creates an early signal that the project is running late, which tends to eliminate "we'll make it up later" mentalities.
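
One simple way to picture that early signal is to compare how much of the buffer is gone against how much of the planned work is done. The sketch below is only an illustration: the `buffer_status` helper, the numbers, and the rule itself (warn when the buffer is burning faster than work is completing) are my assumptions for this example, not a standard threshold.

```python
def buffer_status(work_done, work_total, buffer_used, buffer_total):
    """Compare buffer consumption against progress through the schedule.

    Work is measured in planned days (from the likely estimates);
    buffer is measured in days of slip absorbed so far.
    """
    progress = work_done / work_total
    burn = buffer_used / buffer_total
    return {
        "progress": progress,
        "buffer_burn": burn,
        # Assumed rule of thumb: warn when burn outpaces progress.
        "early_warning": burn > progress,
    }

# Hypothetical project: 30 planned days of work, 12 days of buffer.
early = buffer_status(work_done=10, work_total=30, buffer_used=6, buffer_total=12)
later = buffer_status(work_done=20, work_total=30, buffer_used=6, buffer_total=12)

print(early)  # a third of the work done, half the buffer gone
print(later)  # two thirds of the work done, half the buffer gone
```

In the first case, half the buffer is gone with only a third of the work done, so the warning trips long before the end date is actually in question.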

Calculating what the buffer should be, and how the "pessimistic" cases blend together (after all, not everything will go wrong), is the next level of applying this technique. If you're really fancy, you can do some Monte Carlo simulation, or if you're less fancy, like me, you can use root sum of squares (RSS). Again, the idea here is to drive good discussions.
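
Here's a rough sketch of both approaches on a made-up chain of three tasks. The RSS part takes each task's gap between pessimistic and likely, squares the gaps, sums them, and takes the square root. The Monte Carlo part is one possible version, under the assumption that each task is an independent log-normal with the likely estimate as its median and the pessimistic estimate as its 90th percentile. The task numbers and the `simulate_p90` helper are hypothetical.

```python
import math
import random
from statistics import NormalDist

# Hypothetical chain of tasks: (likely, pessimistic) in days.
tasks = [(5, 10), (8, 20), (3, 12)]

best_case = sum(likely for likely, _ in tasks)    # everything goes well
worst_case = sum(pess for _, pess in tasks)       # everything goes wrong

# Root sum of squares of each task's (pessimistic - likely) gap.
rss_buffer = math.sqrt(sum((pess - likely) ** 2 for likely, pess in tasks))
rss_case = best_case + rss_buffer

def simulate_p90(tasks, trials=10_000, seed=0):
    """Estimate the 90th percentile of total duration by simulation.

    Assumes independent log-normal tasks with likely = median and
    pessimistic = 90th percentile.
    """
    rng = random.Random(seed)
    z90 = NormalDist().inv_cdf(0.90)
    params = [
        (math.log(likely), (math.log(pess) - math.log(likely)) / z90)
        for likely, pess in tasks
    ]
    totals = sorted(
        sum(rng.lognormvariate(mu, sigma) for mu, sigma in params)
        for _ in range(trials)
    )
    return totals[int(0.90 * trials)]

monte_carlo_p90 = simulate_p90(tasks)

print(f"best case:  {best_case} days")
print(f"worst case: {worst_case} days")
print(f"RSS case:   {rss_case:.1f} days (buffer {rss_buffer:.1f})")
print(f"simulated 90th percentile: {monte_carlo_p90:.1f} days")
```

Either way, the combined pessimistic case comes out well short of simply adding up every worst case, which is the "not everything will go wrong" intuition in numbers.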

[Figure: likely and pessimistic]

[Figure: best case, worst case, RSS case]

[Figure: schedule slip]

There are a few proprietary software packages used by consultants that do this sort of modeling. I've made do with spreadsheets for the calculations and sort of hacked together more common tools like Jira, Asana, ClickUp, etc., using the low bound estimates, sliding the schedule manually when slip happens, and keeping a placeholder buffer that doesn't move as a separate line item. I don't know why these tools aren't built into current products... maybe they're more common to manufacturing operations and less well known, or maybe they're just too complicated to add.

That's all for today!

Best regards
Sam Feller
aka THE Awkward Engineer