A county fair in England in the early 20th century held a contest to guess the weight of an ox. The person with the closest guess was awarded a prize. The statistician Francis Galton analyzed the guesses and found that the average (mean) guess was not only better than the best individual guess, it was within one pound of the actual weight (1,198 lbs.). Galton also calculated another average, the median. It was close as well, but not as close as the mean: the median was nine pounds higher than the actual weight. For those not familiar with the distinction, the mean is the sum of all the numbers in a sample divided by how many numbers there are. The median is the middle number when the sample is sorted. For example, the mean of 1, 2, and 6 is (1+2+6)/3 = 3, and the median is 2.
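For readers who want to try the distinction themselves, here is a minimal sketch using Python's standard `statistics` module on the same three numbers. The variable names are only for illustration.

```python
from statistics import mean, median

# The three-number sample from the example above.
sample = [1, 2, 6]

sample_mean = mean(sample)      # (1 + 2 + 6) / 3
sample_median = median(sample)  # middle value once the sample is sorted

print("mean:", sample_mean)
print("median:", sample_median)
```

The mean comes out to 3 and the median to 2, matching the hand calculation.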
This example illustrates the power of using multiple methods in prediction. Whether in weather prediction or the forecasting of cost for weapon systems, using more than one model has its merits. This effect is often referred to as the Wisdom of Crowds, after a best-selling book on the subject.
This past weekend I went to a friend’s birthday party. He is an avid fisherman. To celebrate a milestone birthday, his family organized a party with decorations that celebrated his love of fishing. One of the decorations was a mason jar filled with multi-colored fish-shaped snack crackers. People at the party could guess the number of crackers in the jar. The person with the closest guess was awarded a $25 Amazon gift card.
I entered a guess of 800. The actual number was 827 and the closest guess was, amazingly enough, 825. I analyzed the entries after the contest. It turns out my guess was tied for second-best. The next closest was 875. There were 25 entries in all that ranged from a low of 2 to a high of 10,000. The mean guess was 1,368 and the median was 800. The median was a good predictor – only one person’s guess was better than the median. But the mean was a better predictor than only five individual guesses. The mean was skewed higher by the guess of 10,000 and another guess of 5,120.
A sample of 25 is small compared to the number of people who entered the weight-guessing contest in England, which had 800 entrants. In a large sample, a few outliers do not have as large an influence on the mean. This illustrates the challenge inherent in small data sets: the Wisdom of Crowds effect works better when you have a large number of guesses. Also, with a small number of guesses, the median tends to be a better predictor than the mean, because it is not pulled around by a few extreme values.
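To make the sample-size point concrete, here is a rough sketch of how much a single wild guess moves the mean in a crowd of 25 versus a crowd of 800. The crowd itself is hypothetical: I assume everyone guesses the true count of 827 except one person who guesses 10,000. Real guesses are much noisier than that, so this isolates only the effect of the one outlier.

```python
from statistics import mean, median

TRUE_COUNT = 827   # actual number of crackers in the jar
OUTLIER = 10_000   # the highest guess at the party


def crowd_with_one_outlier(n):
    """Hypothetical crowd: n - 1 perfect guesses plus one wild guess."""
    return [TRUE_COUNT] * (n - 1) + [OUTLIER]


def mean_shift(n):
    """How far the single outlier pulls the mean away from the true count."""
    return mean(crowd_with_one_outlier(n)) - TRUE_COUNT


def median_shift(n):
    """How far the single outlier pulls the median away from the true count."""
    return median(crowd_with_one_outlier(n)) - TRUE_COUNT


for n in (25, 800):
    print(f"n={n}: mean off by {mean_shift(n):.1f}, median off by {median_shift(n)}")
```

Under these assumptions the one outlier pushes the mean up by about 367 with 25 guesses, but only by about 11 with 800 guesses. The median does not move in either case.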
Small data sets pose a broader problem as well. There is a tendency to think that small data sets behave the same way as large data sets. Many times in the aerospace and weapon systems business, there are only a few directly relevant data points. For example, the Missile Defense Agency uses a hit-to-kill technology that involves launching a small, highly maneuverable payload called a kill vehicle, which is designed to hit enemy payloads either high in the atmosphere or outside of it. Only a handful of these systems have been developed in the past. This makes statistical analysis of such systems difficult, to the point where forecasting the cost of a newly proposed kill vehicle concept can be a guessing game of its own. There are some ways to cope with small data sets, but I will save that for another blog post.
Like your blog! I recently studied the three ways to compute an average: the arithmetic mean, the geometric mean, and the harmonic mean. Are you aware of any studies about these three means with small samples? 😀
Thanks, Bob! Good to hear from you. For positive data, the harmonic mean is always the smallest of the three, so it is the one most influenced by outliers on the low side. If your data are not subject to that, it might be a good choice. The arithmetic mean is the most sensitive to outliers on the high side. The geometric mean falls between the two and is fairly robust to outliers. It is similar to the median in some ways, and for a lognormal distribution, the geometric mean is the same as the median. The median is almost completely insensitive to outliers. When the data are not lognormal, the geometric mean will be influenced a little by outliers, but much less than the arithmetic mean. For example, suppose a phenomenon mostly varies between 1 and 3, and you observe four values: 1, 2, 3, and 10. The arithmetic mean is 4. The median is 2.5. The harmonic mean is 2.07, and the geometric mean is 2.78. The harmonic and geometric means are closer to what you expect to see as the central tendency and are not heavily influenced by the outlier on the high side.
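If you want to check these numbers yourself, here is a minimal sketch using Python's standard `statistics` module. It assumes Python 3.8 or later, since `geometric_mean` was added in that version. The dictionary name `summary` is just for illustration.

```python
from statistics import geometric_mean, harmonic_mean, mean, median

# The four observations from the example: three typical values and one high outlier.
values = [1, 2, 3, 10]

summary = {
    "arithmetic mean": mean(values),
    "median": median(values),
    "harmonic mean": harmonic_mean(values),
    "geometric mean": geometric_mean(values),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, this gives 4.00, 2.50, 2.07, and 2.78, in that order.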