(pwrSumAvg * n - n * mean * mean) / (n - 1)

with

(float(n)/(n-1))*(pwrSumAvg-mean*mean)

to avoid taking the difference of numbers whose values scale like n.
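
As a quick check that the two expressions agree algebraically, here is a minimal Python sketch. The function names and the sample data are made up for illustration; only `pwrSumAvg`, `mean`, and `n` come from the formulas above, where `pwrSumAvg` is assumed to be the average of the squared values.

```python
import statistics


def variance_scaled(pwr_sum_avg, mean, n):
    # Original form: both terms in the numerator grow in proportion to n.
    return (pwr_sum_avg * n - n * mean * mean) / (n - 1)


def variance_factored(pwr_sum_avg, mean, n):
    # Suggested form: subtract first, then apply the n/(n-1) correction.
    return (float(n) / (n - 1)) * (pwr_sum_avg - mean * mean)


# Illustrative data, not taken from the post.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n
pwr_sum_avg = sum(x * x for x in data) / n  # average of the squares

v_scaled = variance_scaled(pwr_sum_avg, mean, n)
v_factored = variance_factored(pwr_sum_avg, mean, n)
v_reference = statistics.variance(data)  # sample variance from the stdlib
```

On small, well-scaled input like this the two forms should match to rounding error; the factored form only starts to matter when n is large enough that the two products in the numerator lose low-order digits.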

]]>The Welford method you refer to yields a rapidly growing “M”, whereas the method described in this post keeps a “pwrSumAvg” that is much more stable. Instead of shifting the values by 1E15, try multiplying them by 1E15 and you will see the difference between the “M” and “pwrSumAvg” intermediate values.

Now push that difference out to 1E15 data points: the Welford method can no longer hold its “M” value in a single 64-bit number, and the result drifts around in a catastrophic failure.
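
To make the two intermediate values concrete, here is a rough Python sketch that tracks both on the same input. It assumes “M” is Welford's running sum of squared deviations and that “pwrSumAvg” is updated as a running average of the squared values; the function names and the test data are hypothetical. It only illustrates how each quantity scales with the number of points, not the failure case itself.

```python
def welford_m(values):
    # Welford's update: m accumulates the sum of squared deviations,
    # so it keeps growing as more points are added.
    n = 0
    mean = 0.0
    m = 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m += delta * (x - mean)
    return m


def running_pwr_sum_avg(values):
    # Running average of x*x (assumed update rule): it stays at the
    # scale of the squared data no matter how many points are seen.
    n = 0
    pwr_sum_avg = 0.0
    for x in values:
        n += 1
        pwr_sum_avg += (x * x - pwr_sum_avg) / n
    return pwr_sum_avg


# Illustrative data: alternating 1.0 and 3.0 (mean 2, each deviation is 1).
short_run = [1.0, 3.0] * 5     # 10 points
long_run = [1.0, 3.0] * 500    # 1000 points

m_short, m_long = welford_m(short_run), welford_m(long_run)
p_short, p_long = running_pwr_sum_avg(short_run), running_pwr_sum_avg(long_run)
```

With this data “M” goes from about 10 to about 1000 as the run gets 100 times longer, while “pwrSumAvg” stays near 5 in both cases.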

]]>How normal does a distribution have to be before someone should be allowed to use this algorithm? The answer will always be situational.

]]>