Sunday, December 11, 2011

Best guessing

This week's S-word article is about averages: arithmetic mean, geometric mean, harmonic mean, median and mode. This isn’t even going into maximum likelihood estimates or least squares estimates! All of these things are our best guess at a single summary statistic for the data. If we had to pick one number that represented the data, what would it be? There are all kinds of criteria for judging whether an estimate is any good or not. How does it behave as the sample size gets larger? Does it have a tendency to be too high or too low (relative to a known target like the population mean)? Statisticians talk of BLUE (best linear unbiased estimator) and consistency (as the data set gets bigger, the estimate should home in on the true value). The ‘take home message’ here is that there is no single best estimate of centrality; the best estimate for a particular situation could be any of those mentioned. In some cases there may not be much difference between the values anyway, so there’s not much to worry about. In other cases, we need to be a bit more thoughtful and choose our estimator with care.
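
To make the comparison concrete, here is a minimal sketch that computes all five averages for one small sample using Python's standard `statistics` module. The sample values and the helper name `summaries` are made up for illustration; they are not from any real data set.

```python
import statistics


def summaries(data):
    """Return five common 'best guess' summaries of a list of positive numbers."""
    return {
        "arithmetic mean": statistics.mean(data),
        # Geometric and harmonic means only make sense for positive values.
        "geometric mean": statistics.geometric_mean(data),
        "harmonic mean": statistics.harmonic_mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
    }


# Hypothetical, right-skewed sample: one large value pulls the arithmetic mean up.
sample = [1, 2, 2, 4, 8]
result = summaries(sample)

for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

For this skewed sample the five numbers disagree: the harmonic mean comes out smallest, the arithmetic mean largest, and the median and mode both sit at 2. On a symmetric sample they would be much closer together, which is the case where the choice of estimator matters least.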
