PMBOK standard deviation formula is most often wrong!
“Lies, damned lies, and statistics”
British Prime Minister Benjamin Disraeli
You might well ask what “Standard Deviation” is. Sounds like a regular deviant? And what could it possibly have to do with managing our projects, given that we have all no doubt managed projects successfully without any knowledge of this peculiar statistical thing involving probability theory?
A statistics buff would tell us that Standard Deviation is about variation from the Mean. In the project context, the Mean or Best Estimated Time (BET) is a weighted average that is best calculated using the Program Evaluation and Review Technique (PERT) formula. A low Standard Deviation indicates that the pessimistic and optimistic durations are very close to the Mean (see the blue curve in the diagram below). A high Standard Deviation indicates that these estimates of duration are spread out over a large range (see the red curve). Standard Deviation is usually represented by the Greek letter sigma, σ. Basically, the Mean shows where the curve is centred (central tendency), and the Standard Deviation determines the width of the curve (dispersion). The greater the width, the less certain is our estimate of project duration (or any other variable).
Risky and volatile projects have a higher Standard Deviation in terms of their cost and duration. Uncertainty or Standard Deviation is a quantification of doubt or spread about a measurement. In fact, statistics is essentially the study of uncertainty. For example, we might forecast that our project task duration will be 20 days plus 2 days or minus 1 day, at a 95% level of confidence. As a ‘rule of thumb’, roughly two thirds of all readings will fall between plus and minus (±) one Standard Deviation of the average. Roughly 95% of all readings will fall within two standard deviations. This ‘rule’ applies widely, although it is by no means universal.
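The rule of thumb can be checked with a few lines of Python. This is a minimal sketch that assumes a perfectly normal distribution; the function name `normal_coverage` is just an illustrative choice.

```python
import math


def normal_coverage(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within ±{k} standard deviations: {normal_coverage(k) * 100:.2f}%")
```

For one, two and three standard deviations this gives roughly 68%, 95% and 99.7%, which is where "two thirds" and "95%" come from. For a skewed distribution the percentages will differ.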
The PMBOK (the Project Management Institute’s bible, its Project Management Body of Knowledge) tells us that we can determine a project’s or a project task’s Standard Deviation by applying the following simple formula:
SD (σ) = (P – O)/6
P is the Pessimistic duration, when things very occasionally go really wrong, and O is the Optimistic duration, when things very occasionally go very well. For example, if P = 25 days and O = 10 days, then by PMBOK reckoning the SD = (25 – 10)/6 = 2.5 days. Any person with a Six Sigma background would take a look at this formula and laugh. Dividing the range by 6 treats O and P as sitting three standard deviations either side of the Mean, so the PMBOK formula assumes a symmetrical bell-shaped curve or normal (Gaussian) distribution. For durations, that distribution suggests there is a 50% chance that the project will take less time than the Mean and a 50% chance that it will take longer. While there is no such thing as an exact estimate of project duration, an accurate estimate is one where the project is as likely to come in late as it is to come in early.
However, the PMBOK formula seems simplistic when we consider that a normal distribution seldom holds for project durations. A positively skewed beta distribution is much more common, given that there is a limit to how quickly we can complete a project but virtually no limit to how long the same project might take. The resultant skewed distribution does not possess the characteristics of the normal curve.
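To put a number on that skew, here is a minimal sketch that assumes the common beta-PERT convention, in which the most likely value carries a weight of 4. That convention, the parameter name `lam` and the function names are assumptions for illustration, not something the PMBOK prescribes. It uses the example estimates from this post: O = 10, M = 13.75 and P = 25 days.

```python
import math


def beta_pert_shape(o: float, m: float, p: float, lam: float = 4.0) -> tuple[float, float]:
    """Shape parameters (alpha, beta) of a beta-PERT distribution on the range [o, p]."""
    alpha = 1 + lam * (m - o) / (p - o)
    beta = 1 + lam * (p - m) / (p - o)
    return alpha, beta


def beta_skewness(alpha: float, beta: float) -> float:
    """Skewness of a beta distribution; positive means a longer tail to the right."""
    numerator = 2 * (beta - alpha) * math.sqrt(alpha + beta + 1)
    denominator = (alpha + beta + 2) * math.sqrt(alpha * beta)
    return numerator / denominator


o, m, p = 10, 13.75, 25
alpha, beta = beta_pert_shape(o, m, p)
mean = o + (p - o) * alpha / (alpha + beta)

print(f"alpha = {alpha}, beta = {beta}")
print(f"mean = {mean} days, most likely = {m} days")
print(f"skewness = {beta_skewness(alpha, beta):.3f}")
```

Under these assumptions the skewness is positive and the Mean sits to the right of the Most Likely value, which is the long right-hand tail described above. A symmetrical estimate (M halfway between O and P) gives a skewness of zero.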
Given the above diagram that shows a beta distribution, the Most Likely (M) time in which to complete this project is 13.75 days, our optimistic (O) or best case prediction is 10 days, and our pessimistic (P) or worst case prediction is 25 days. If we applied the (P – O)/6 formula, the Standard Deviation would be 2.5 days, a result that is evidently good enough for PMI exam purposes. However, weighting the squared deviations of the three estimates from the PERT Mean (E) gives a larger figure of about 4.68 days, using the following formula (keeping in mind BODMAS, the sequence in which to complete mathematical operations). The figure of 7.81 days that a standard-deviation calculator returns for these numbers is a different quantity: it is the sample standard deviation of the three raw estimates about their simple average of 16.25.
SD = √{[(O – E)² + 4(M – E)² + (P – E)²]/6}, where E = (O + 4M + P)/6
SD = √{[(10 – 15)² + 4(13.75 – 15)² + (25 – 15)²]/6} = √(131.25/6) ≈ 4.68 days
Rather than attempt to solve this the hard way, especially if we haven’t got a scientific calculator, here is a calculator that will do it for us:
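We just enter the three critical numbers, which in this instance are 10, 13.75, and 25. Phew, much easier. Bear in mind that the calculator treats these as three ordinary data points, which is where the 7.81 comes from (it is their sample standard deviation). The Mean (average) it reports, 16.25, is their simple average rather than the PERT-weighted 15, and the Variance of 40.625 is their population variance (dividing by 3 rather than 2), which is why it is not exactly 7.81 squared.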
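If no calculator is to hand, the same numbers can be reproduced with Python's standard library. This is a sketch only: `weighted_sd` is a hypothetical helper that evaluates the weighted formula term by term, and the `statistics` functions treat the three estimates as plain data points.

```python
import math
import statistics


def weighted_sd(o: float, m: float, p: float) -> float:
    """Square root of the 1-4-1 weighted squared deviations from the PERT mean E."""
    e = (o + 4 * m + p) / 6
    return math.sqrt(((o - e) ** 2 + 4 * (m - e) ** 2 + (p - e) ** 2) / 6)


estimates = [10, 13.75, 25]  # optimistic, most likely, pessimistic (days)

print("PMBOK (P - O)/6:       ", (estimates[2] - estimates[0]) / 6)
print("weighted formula:      ", round(weighted_sd(*estimates), 2))
print("simple mean:           ", statistics.mean(estimates))
print("sample SD (n - 1):     ", round(statistics.stdev(estimates), 2))
print("population variance (n):", statistics.pvariance(estimates))
```

Running it makes plain which figure comes from which calculation: 2.5 from the PMBOK shortcut, about 4.68 from the weighted formula, and 16.25, 7.81 and 40.625 from treating the estimates as three raw readings.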
The table below shows some Standard Deviations determined by the PMI-recommended formula. However, the true Standard Deviations are Task A = 17.56, Task B = 24.01, Task C = 4.51, and Task D = 6.56, all rather different from the figures shown below:
In summary, in the diagram below the red Standard Deviation is quite large (SD = 10), meaning we should have less confidence in the estimate as there is a large range between the estimated optimistic and pessimistic durations. If the Standard Deviation is small (SD = 5) as for the blue distribution, we should be pretty confident in our estimate, since the optimistic and pessimistic durations are much closer.
Interestingly, the Central Limit Theorem (CLT), which is extremely important in the world of statistics, implies for project management that the greater the number of critical tasks, the greater the likelihood that the project critical path duration will approximate a normal distribution, provided the task durations are reasonably independent of one another. As a rule of thumb, this holds well once the number of critical tasks exceeds 30. The point of the theorem is that no matter what the original distributions of the various critical task durations, the sum (and therefore the Mean) of a large enough number of them will have a nearly normal distribution. The diagram below attempts to illustrate this theorem, but for a detailed explanation check here
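A small simulation shows the effect. This is a sketch under stated assumptions: every critical task is independent and follows the same skewed beta distribution (shape 2 and 4, stretched over 10 to 25 days), and the function names, seed and run count are arbitrary choices for illustration.

```python
import random
import statistics


def sample_skewness(values: list[float]) -> float:
    """Average cubed z-score; close to 0 for a symmetrical, bell-like sample."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulate_path_durations(n_tasks: int, n_runs: int, rng: random.Random) -> list[float]:
    """Total duration of n_tasks independent, positively skewed tasks, repeated n_runs times."""
    return [
        sum(10 + 15 * rng.betavariate(2, 4) for _ in range(n_tasks))
        for _ in range(n_runs)
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable
one_task = simulate_path_durations(1, 5000, rng)
thirty_tasks = simulate_path_durations(30, 5000, rng)

print(f"skewness of a single task:     {sample_skewness(one_task):.2f}")
print(f"skewness of a 30-task path:    {sample_skewness(thirty_tasks):.2f}")
```

The single task shows a clear positive skew, while the total for the 30-task path is much closer to zero, in other words much closer to a symmetrical bell curve. If tasks are strongly correlated (the same bad weather delaying all of them, say), the convergence is slower.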
This blog item may be of some interest, but if we are to tackle the PMP (Project Management Professional) exams, here’s all we need to know:
- Three Point Estimate. This is simply the average of the pessimistic (P), most likely (M), and optimistic (O) estimates or (P + M + O)/3.
- PERT Formula. This provides us with a weighted average, which gives 4 times as much weight to the most likely or M estimate, where the formula is (P + 4M + O)/6.
- Standard Deviation. The formula for the standard deviation is (P – O)/6. We also need to know the following three figures for standard deviations: ±1σ = 68.27%, ±2σ = 95.45%, and ±3σ = 99.73%. If we have a “normal” distribution and take the range from 1 sigma (standard deviation) below the average (aka the Mean) to 1 sigma above it, we can be assured that about 68% of the values fall within that range. In the exam, a question will say “95.45%” and we are expected to know that this means 2 standard deviations, and similarly “68.27%” means 1 standard deviation and “99.73%” means 3 standard deviations or 3 sigma. Incidentally, the term “Six Sigma” means striving for near-perfect quality such that 99.99966% of the products we manufacture are free from defects:
- Variance. This is the standard deviation squared, or [(P – O)/6]². This is used if we are trying to get the estimate for the whole project. What we do is get the total of the estimates for all tasks along the critical path. Then we take the variance for each task, sum them up across the whole project, and take the square root to get the standard deviation for the project estimate: project SD = √(sum of the task variances).
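The four exam formulas above can be sketched in a few lines of Python. The function names and the three-task critical path are made up for illustration; only the first task uses the figures from this post.

```python
import math


def three_point(o: float, m: float, p: float) -> float:
    """Simple average of the three estimates."""
    return (o + m + p) / 3


def pert(o: float, m: float, p: float) -> float:
    """PERT weighted average, with 4 times the weight on the most likely estimate."""
    return (o + 4 * m + p) / 6


def pmbok_sd(o: float, p: float) -> float:
    """Exam formula for the standard deviation of one task."""
    return (p - o) / 6


def project_estimate(tasks: list[tuple[float, float, float]]) -> tuple[float, float]:
    """Total PERT duration and project standard deviation for (O, M, P) tasks on the critical path."""
    total = sum(pert(o, m, p) for o, m, p in tasks)
    variance = sum(pmbok_sd(o, p) ** 2 for o, _, p in tasks)  # variances add, SDs do not
    return total, math.sqrt(variance)


# Hypothetical critical path: (optimistic, most likely, pessimistic) in days
critical_path = [(10, 13.75, 25), (4, 6, 10), (8, 10, 18)]

total, sd = project_estimate(critical_path)
print(f"project estimate: {total:.2f} days, standard deviation: {sd:.2f} days")
print(f"about 95% range:  {total - 2 * sd:.2f} to {total + 2 * sd:.2f} days")
```

Note that the task standard deviations are never added directly. Adding 2.5 + 1 + 1.67 would overstate the project's spread, which is why the variances are summed first and the square root taken at the end.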