Have you ever had to solve a homework problem and used more than one way to solve it? If both methods yielded the same result, you would think that would give you more confidence in your answer, especially in math. Speaking of which, I had a math professor in graduate school who did not want us to do this for assignments. He thought that using multiple methods meant that we did not have enough confidence in any single approach. This biased me towards the use of a single best method in my work in cost estimating. However, in studying machine learning, I discovered that research had shown that using multiple methods can improve predictive accuracy. Typically, when you make a model too complex, it does not predict well in practice. Combining several models also makes the overall prediction engine more complex, yet the combined predictions do better in practice than the single models used by themselves. This was a revelation to me.
The use of multiple techniques for prediction is called the ensemble approach. An ensemble is a group of items viewed as a whole rather than individually. Suppose we have multiple models from which we would like to choose the “best”. The models could be constructed using different datasets, methods, variables or equation forms. Rather than pick one of them, the ensemble approach combines their predictions. An early example of ensemble predictions is a competition to guess the weight of an ox at an English county fair in 1906. Eight hundred people entered the contest. The statistician Francis Galton (who coined the term regression) was interested in the results, and thought that the average, or mean, of the guesses would be far from the actual weight. He was surprised to discover that the mean of the 800 submissions, 1,197 pounds, was only one pound from the actual weight of 1,198 pounds.
More recently, competitions to solve problems requiring prediction have become popular, and the data science company Kaggle hosts many of them. The best known prediction competition is the $1 million Netflix Prize, which Netflix ran itself, where teams submitted models to improve the company’s recommendation system. The catch with submissions to such competitions is that you get a score on how you do, but you cannot see the actuals to see the error of each estimate. Ensemble techniques have consistently proven to be crucial to winning submissions to these competitions, including the Netflix Prize. In that particular competition, two teams tied for first. The tiebreaker went to the team that submitted first. The first-place finisher submitted twenty minutes before the second-place finisher and won the entire $1 million prize.
Ensembles are prominent in weather forecasting, especially in storm prediction. Whenever you see a tropical storm or a hurricane discussed in a weather forecast, there will be predicted paths plotted for several different models. Most will cluster near a common path, but occasionally one or two may be very different. The true path will likely be much closer to the path where most of the models fall than to the outliers. Using multiple models helps to avoid being unduly influenced by such outlying predictions.
The idea is counterintuitive. In most applications, you should expect that the best is better than the average. You would not expect two mediocre athletes to perform better on average than a superstar. However, that is what happens with models. The average of several models, some of which may be good and others mediocre, will on average perform better than the best model. Studies have shown that the improvement ranges between 5 and 30 percent.
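To make the idea concrete, here is a minimal sketch in Python. The actual values and both sets of predictions are made-up numbers, not data from any study, and I chose them so that the two models tend to miss in opposite directions. That is the situation in which averaging helps most. If both models made the same errors, the average would be no better than either one.

```python
from statistics import mean

# Hypothetical actual values and two hypothetical models' predictions.
# The numbers are invented for illustration only.
actuals = [100.0, 200.0, 300.0, 400.0]
model_a = [110.0, 190.0, 310.0, 390.0]
model_b = [92.0, 214.0, 288.0, 412.0]


def mean_absolute_error(predictions, targets):
    """Average size of the miss, ignoring its direction."""
    return mean(abs(p - t) for p, t in zip(predictions, targets))


# The ensemble prediction is the simple average of the two models.
ensemble = [mean(pair) for pair in zip(model_a, model_b)]

mae_a = mean_absolute_error(model_a, actuals)
mae_b = mean_absolute_error(model_b, actuals)
mae_ensemble = mean_absolute_error(ensemble, actuals)

print(f"Model A error:  {mae_a}")
print(f"Model B error:  {mae_b}")
print(f"Ensemble error: {mae_ensemble}")
```

With these invented numbers, the averaged prediction has a smaller error than either model alone, because the overestimates of one model partly cancel the underestimates of the other.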
James Surowiecki discusses this crowd approach in “The Wisdom of Crowds”. When you put together a large enough and diverse enough group of people and ask them to make decisions affecting matters of general interest, that group’s decisions will be intellectually superior to those of an isolated individual. Applying this concept to cost estimating, we can conclude that, in the right circumstances, an average forecast is better than a single forecast.
There are many ways to combine estimates. A simple average is the most straightforward approach. It’s also possible to calculate a weighted average of the models to do even better. Regardless of which method you choose, when making a forecast or prediction, use multiple methods whenever you can.
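As a rough illustration of the two ways of combining estimates mentioned above, here is a small Python sketch. The function name `combine_estimates`, the three estimates, and the weights are all hypothetical. How to choose the weights is an open question that this sketch does not answer. One common idea, which I assume here only for the example, is to give more weight to a method that has been more accurate in the past.

```python
def combine_estimates(estimates, weights=None):
    """Return the simple average, or the weighted average if weights are given."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    if weights is None:
        # Equal weights reduce the formula to a simple average.
        weights = [1.0] * len(estimates)
    if len(weights) != len(estimates):
        raise ValueError("need exactly one weight per estimate")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * e for w, e in zip(weights, estimates)) / total_weight


# Three hypothetical estimates of the same quantity from three methods.
estimates = [120.0, 100.0, 110.0]

simple = combine_estimates(estimates)
# Hypothetical weights: the second method is trusted twice as much.
weighted = combine_estimates(estimates, weights=[1.0, 2.0, 1.0])

print(simple)    # 110.0
print(weighted)  # 107.5
```

The weighted result moves toward the second estimate because that method carries the largest weight. With equal weights, the function returns the simple average.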