The latest paper that caught my attention, and that I thought I would comment on, is the following (other publications I have commented on can be seen here).
Byungho Min, Jinhyuck Kim, Chongyoun Choe, Hyeonsang Eom and R.I. (Bob) McKay (2008) A compound framework for sports results prediction: A football case study, Knowledge Based Systems, 21(7), 551-562 (doi).
You might be able to download a copy of the paper from here. Note that this link may not be permanent and it may not be an exact copy of the paper that was published (although it does look like it).
The paper presents a framework which is designed to predict the outcome of football matches. They call their system FRES (Football Result Expert System).
The authors note that most previous research focuses on a single factor when predicting the outcome of a football match, and the main factor that is used is usually the score data. Even when other factors are taken into account, the score still tends to dominate the prediction process.
Within FRES, two machine learning methodologies are utilised: a rule-based system and Bayesian networks. The paper describes how they are used within FRES in enough detail to allow readers to reproduce the system (as all good scientific papers should).
FRES is tested on the 2002 World Cup tournament. Most football prediction systems are tested on league competitions, where teams (typically) play a double round robin tournament. Testing their approach on the 2002 World Cup means that the system cannot easily be compared to other systems. Where previous approaches have been tested on other tournaments (for example, previous World Cups), not all the data was available to enable FRES to make those predictions. In the words of the authors, “In the case of the few works which predict a tournament such as the World Cup, the available evaluation was conducted with old data, such as the World Cup 1994, 1998, which would unfairly hobble FRES, since some of the data it relies on are not available for these earlier tournaments.”
Although not a scientific term (at least not one I am familiar with!), I do like the term “unfairly hobble”.
In order to provide some sort of comparison with FRES, the paper implements two other systems, a historic predictor and a discounted historic predictor.
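To give a feel for what a historic predictor and a discounted historic predictor might look like, here is a minimal sketch. It is my own guess at the general idea, not the authors' implementation: the scoring scheme (1 for a win, 0.5 for a draw, 0 for a loss), the `discount` value and all of the function names are assumptions made purely for illustration.

```python
from typing import Callable, Sequence


def historic_strength(results: Sequence[float]) -> float:
    """Plain average of past results (assumed coding: 1 win, 0.5 draw, 0 loss)."""
    if not results:
        return 0.5  # no history at all: treat the team as a coin flip
    return sum(results) / len(results)


def discounted_strength(results: Sequence[float], discount: float = 0.8) -> float:
    """Weighted average in which older results count for less.

    `results` is ordered oldest first. The most recent result has weight 1,
    the one before it `discount`, the one before that `discount ** 2`, etc.
    The value 0.8 is an arbitrary illustrative choice.
    """
    if not results:
        return 0.5
    n = len(results)
    weights = [discount ** (n - 1 - i) for i in range(n)]
    return sum(w * r for w, r in zip(weights, results)) / sum(weights)


def predict_winner(
    team_a: str,
    results_a: Sequence[float],
    team_b: str,
    results_b: Sequence[float],
    strength: Callable[[Sequence[float]], float] = historic_strength,
) -> str:
    """Pick the team with the higher strength (ties go to team_a)."""
    return team_a if strength(results_a) >= strength(results_b) else team_b


# Two hypothetical teams with the same overall record but opposite form.
improving = [0, 0, 1, 1]  # lost early, won recently
declining = [1, 1, 0, 0]  # won early, lost recently
print(historic_strength(improving), historic_strength(declining))
print(discounted_strength(improving), discounted_strength(declining))
```

The two hypothetical teams have identical plain averages, so only the discounted version separates them, which is the whole point of discounting older results.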
FRES was able to predict six of the countries that finished in the top eight of the tournament. The other predictors were able to predict five. Moreover, various statistical tests are conducted which confirm that FRES is statistically better than the other two methods.
One thing I like about the FRES system is that it has a lookahead mechanism. Based on this, England does not rate very highly because, due to the draw, there is a high probability that they will meet Brazil in the quarter finals. Turkey, on the other hand, are rated more highly due to the perceived easier draw.
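The effect of the draw can be illustrated with a small calculation. The sketch below is not how FRES does its lookahead (I am not reproducing the paper's mechanism here); it is just the standard way of propagating probabilities through a knockout bracket, given some pairwise win probability. The team names, the ratings and the `rating_win_prob` formula are all made up for illustration.

```python
from typing import Callable, Dict, List


def advance_probabilities(
    bracket: List[str], win_prob: Callable[[str, str], float]
) -> List[Dict[str, float]]:
    """Probability that each team wins round 1, round 2, ... of a knockout.

    `bracket` lists the teams in draw order (its length must be a power of
    two), so neighbours meet in round 1, the winners of neighbouring pairs
    meet in round 2, and so on. `win_prob(a, b)` is the assumed probability
    that a beats b.
    """
    n = len(bracket)
    if n < 2 or n & (n - 1):
        raise ValueError("bracket size must be a power of two (at least 2)")

    reach = {team: 1.0 for team in bracket}  # everyone starts in round 1
    rounds: List[Dict[str, float]] = []
    block = 2
    while block <= n:
        half = block // 2
        after: Dict[str, float] = {}
        for start in range(0, n, block):
            left = bracket[start:start + half]
            right = bracket[start + half:start + block]
            for side, other in ((left, right), (right, left)):
                for team in side:
                    # Must get this far, then beat whoever emerges opposite.
                    after[team] = reach[team] * sum(
                        reach[opp] * win_prob(team, opp) for opp in other
                    )
        rounds.append(after)
        reach = after
        block *= 2
    return rounds


# Hypothetical ratings: "A" is strong, "B" is weak, "C" and "D" are average.
ratings = {"A": 3.0, "B": 1.0, "C": 2.0, "D": 2.0}


def rating_win_prob(a: str, b: str) -> float:
    """Assumed toy model: chance of winning is proportional to rating."""
    return ratings[a] / (ratings[a] + ratings[b])


easy_draw = advance_probabilities(["C", "B", "A", "D"], rating_win_prob)
hard_draw = advance_probabilities(["C", "A", "B", "D"], rating_win_prob)
print("C wins round 1, easy draw:", easy_draw[0]["C"])
print("C wins round 1, hard draw:", hard_draw[0]["C"])
```

The same team "C" gets a different chance of progressing depending only on where it sits in the bracket, which is the kind of effect described above for England and Turkey.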
It would be useful to have FRES tested on league competitions, so that better comparisons could be made with more prediction systems that have been reported in the scientific literature. Perhaps the authors are working on that now? It would, for example, be interesting to see if it beats a random method, or a method which always predicts a home win (as the authors did in the paper I discussed a few days ago).