Formulas for calculating things can be magical. You can find a lot of them in Numerical Recipes, for instance. Trying to make sense of some of the quirks in a formula, the adjustments, can be a bit of a challenge. Amar describes "Making sense of standard deviation" as making sense of two adjustments. One is the use of the root of the mean of the squares (RMS), which is often used in AC circuits as well as in the standard deviation. The other is a sample size adjustment.
One reason given for calculating RMS values is to avoid having negative numbers cancel out positive ones. But you could do that by just using absolute values. What RMS does in addition is give more weight to the larger values. That means larger deviations will have more impact than small ones in the standard deviation calculation. This is also why you take the root only after you have averaged the squared deviations: the root brings the result back to the same units as the original deviations.
For instance, take five samples, each with a deviation of 2 (or -2). Square them, add the squares (=20), divide by 5, and you get 4 to take the root of. The root is 2, the same size as each of the deviations, so the RMS calculation tells us nothing new here. Every sample deviates by 2 and the RMS of the deviations is also 2.
Now, make one sample have a deviation of, say, 6. That’ll push the sum of the squares to 4+4+4+4+36=52 instead of 20. Divide by 5 and you get 10.4 rather than 4. The square root of that is about 3.22. That means the one oddball sample raised the result from 2 to about 3.22, a difference of more than 60% in this deviation averaging calculation. A plain average of the absolute values would only have moved from 2 to 2.8. The larger jump is telling you that you have some weird deviation in the sample, and that is something you want to know when doing a statistical analysis.
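The arithmetic above is easy to check in a few lines of Python. This is only a sketch of the worked example; the helper names `rms` and `mean_abs` are made up for illustration and do not come from any library.

```python
import math

def rms(deviations):
    """Root of the mean of the squares (hypothetical helper)."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))

def mean_abs(deviations):
    """Plain average of the absolute values, for comparison."""
    return sum(abs(d) for d in deviations) / len(deviations)

uniform = [2, -2, 2, -2, 2]   # every deviation has size 2
oddball = [2, -2, 2, -2, 6]   # one deviation bumped up to 6

for name, devs in (("uniform", uniform), ("oddball", oddball)):
    print(f"{name}: rms={rms(devs):.2f}  mean_abs={mean_abs(devs):.2f}")
```

With the uniform set both measures come out at 2. With the oddball set the absolute-value average moves to 2.8 while the RMS moves to about 3.22, which is the extra weight on large deviations at work.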
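The other adjustment in the formula to make for a “standard” deviation has to do with sample size. Statistics give better results the larger the sample size, and when you only have a subset of the whole population you also have to measure the deviations from the subset's own average. That average sits a little closer to the sampled values than the true population average does, so the squared deviations come out a bit too small. You adjust for this by dividing by the sample count minus one instead of the sample count itself. This one is a bit tougher to visualize, but the mathematics do show why it is necessary, and they do not depend on the samples having a ‘normal’ distribution. The larger the sample, the less the adjustment matters.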
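To see the size of that adjustment, here is a small Python sketch that computes the standard deviation both ways. The function name `std_dev`, its `ddof` parameter (a name borrowed from common usage for the amount subtracted from the count), and the data values are all assumptions made up for illustration. The standard library's `statistics.pstdev` and `statistics.stdev` are used only as a cross-check.

```python
import math
import statistics

def std_dev(values, ddof=0):
    """Standard deviation dividing by (n - ddof).

    ddof=0 treats the values as the whole population,
    ddof=1 applies the sample size adjustment (divide by n - 1).
    """
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((v - mean) ** 2 for v in values)
    return math.sqrt(sum_sq / (n - ddof))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, mean is 5

population_sd = std_dev(data, ddof=0)  # divide by n
sample_sd = std_dev(data, ddof=1)      # divide by n - 1

print(f"divide by n:     {population_sd:.3f}  (statistics.pstdev: {statistics.pstdev(data):.3f})")
print(f"divide by n - 1: {sample_sd:.3f}  (statistics.stdev:  {statistics.stdev(data):.3f})")
```

For these eight values, dividing by n gives 2 and dividing by n - 1 gives a slightly larger figure, about 2.14. With a few hundred values the two results would be nearly indistinguishable.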