In some of my posts I comment on a scientific paper that has caught my eye. There is no particular reason for the papers that I choose; they are just of interest to me. In this post, the paper that caught my eye was the following (comments on other papers can be seen here).

Martin Spann and Bernd Skiera (2009) *Sports forecasting: a comparison of the forecast accuracy of prediction markets, betting odds and tipsters*, Journal of Forecasting, 28(1), 55-72 (doi).

This paper looks at three different prediction methods, and assesses their effectiveness in predicting the outcome of premier league matches from the German football league. The three methods that are investigated are prediction markets, tipsters and betting odds.

**Prediction Markets** are based on various people taking a stance on the same event and being willing to back their hunch with money, which they collect if the hunch is right and pay if it is wrong. Because a number of people are taking a stance on the same event, the market as a whole can be seen as a predictive model of that event.

**Tipsters** are (or should be) experts who publish their predictions in newspapers, on web sites etc. The advice from tipsters is often based on their expertise, rather than on some system or formal model. The paper (citing Forrest and Simmons, 2000 as its source) says that tipsters can often beat a random selection method, but do worse than simply choosing a home win every time. It also cites Andersson et al., saying that soccer experts often do worse than people who are less well informed about the game.

**Betting Odds**, in previous work, have been found to be a good forecasting method (not surprising I suppose, seeing as the bookmakers rely on setting the correct prices to make their living). Of course, the bookmakers can change their odds, but when they publish fixed odds (on, say, a special betting coupon), this can be seen as a prediction of the match outcome.

The games that are forecast in this paper are those from the German premier league from three seasons (1999-2000, 2000-2001 and 2001-2002). The number of games predicted by each method varied (Prediction Markets and Betting Odds = 837, Tipsters = 721 and Prediction Markets, Betting Odds and Tipsters = 678). The counts differ simply because of the data that was available. Where a count covers two or three methods, it is the intersection of the games that each of those methods was able to predict.

To evaluate each method, the authors calculate the percentage of correct predictions. They also calculate the root mean squared error, as well as the amount of money that each method would have won (three figures are given: with a 25% fee, with a 12% fee and with no fee). Comparisons are also made with a random selection policy as well as a naive selection policy, which simply assumes a home win.
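To make these three measures concrete, here is a minimal Python sketch. It is my own illustration, not the authors' code, and it rests on several assumptions: the function names and the toy data are hypothetical, the root mean squared error is taken between forecast probabilities and 0/1 outcome indicators, the odds are decimal odds, one unit is staked on every game, and the fee is treated as a surcharge on the stake. The paper may define any of these differently.

```python
from math import sqrt

OUTCOMES = ("home", "draw", "away")

def hit_rate(predicted, actual):
    """Fraction of games whose predicted outcome equals the actual outcome."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

def rmse(probabilities, actual):
    """RMSE between forecast probabilities and 0/1 outcome indicators.

    Assumption: each forecast is a dict of probabilities over OUTCOMES.
    """
    squared_errors = [
        (probs[outcome] - (1.0 if outcome == result else 0.0)) ** 2
        for probs, result in zip(probabilities, actual)
        for outcome in OUTCOMES
    ]
    return sqrt(sum(squared_errors) / len(squared_errors))

def betting_return(predicted, actual, odds, fee=0.0):
    """Return per unit of outlay when staking 1 unit on every prediction.

    Assumption: the fee is a surcharge on the stake, so each bet costs
    1 + fee, and a winning bet pays out the decimal odds.
    """
    payout = sum(o[p] for p, a, o in zip(predicted, actual, odds) if p == a)
    outlay = len(actual) * (1.0 + fee)
    return payout / outlay - 1.0

# Hypothetical toy data: four games, the same forecast and odds for each.
actual = ["home", "draw", "home", "away"]
predicted = ["home", "home", "home", "away"]
probabilities = [{"home": 0.5, "draw": 0.25, "away": 0.25}] * 4
odds = [{"home": 2.0, "draw": 3.0, "away": 3.0}] * 4

accuracy = hit_rate(predicted, actual)
error = rmse(probabilities, actual)
profit_no_fee = betting_return(predicted, actual, odds)
profit_with_fee = betting_return(predicted, actual, odds, fee=0.25)
```

The naive policy is then just `predicted = ["home"] * len(actual)`, which makes it easy to compare any method against the home-win baseline on the same games.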

So, what did the authors find? Over the 837 games, the prediction market and betting odds correctly predicted 52.69% and 52.93% of games respectively. If there was no fee, this would have returned a profit of 12.30% and 11.92% respectively. The naive model (pick home wins) predicted 50.42% correctly and returned a profit of 11.79%. The random method only managed to predict 37.98% of games correctly.

If we look at the 678 games that all three methods could predict, then the percentages of correct predictions were 54.28% (prediction market), 53.69% (betting odds), 42.63% (tipsters), 50.88% (naive model) and 37.98% (random). The returns (assuming no fee) were 16.20% (prediction market), 13.49% (betting odds), -0.19% (tipsters) and 12.44% (naive model).

I’m not sure why profit information is not given for the random model, but it would almost certainly result in a loss.

A further test is also carried out. Only games where methods agree on the selection are *bet* upon. For example, of the 678 games, there are 380 games where the three methods agree on the result. If we only bet on those games, we get a correct prediction percentage of 57.11%, which is higher than that of any single method betting on every game. The profit return would be 13.86% (no fee), 1.66% (12% fee) and -8.72% (25% fee).
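The selection rule above is simple enough to sketch in a few lines of Python. This is only an illustration: the function names and the four toy games are hypothetical, and the fee conversion assumes the fee is a surcharge on the stake, which is my guess rather than something stated in this post.

```python
def unanimous_picks(*methods):
    """Return {game_index: outcome} for games where every method agrees."""
    picks = {}
    for index, forecasts in enumerate(zip(*methods)):
        if len(set(forecasts)) == 1:  # all methods chose the same outcome
            picks[index] = forecasts[0]
    return picks

def return_after_stake_fee(no_fee_return, fee):
    """Convert a no-fee return into a with-fee return.

    Assumption: the fee is added to the stake, so each unit bet costs 1 + fee.
    """
    return (1.0 + no_fee_return) / (1.0 + fee) - 1.0

# Hypothetical toy forecasts for four games.
market = ["home", "home", "away", "draw"]
bookmaker = ["home", "away", "away", "home"]
tipster = ["home", "home", "away", "home"]
actual = ["home", "draw", "home", "home"]

picks = unanimous_picks(market, bookmaker, tipster)
accuracy = sum(actual[i] == pick for i, pick in picks.items()) / len(picks)

# Apply the assumed fee model to the 13.86% no-fee figure quoted above.
with_12_percent_fee = return_after_stake_fee(0.1386, 0.12)
with_25_percent_fee = return_after_stake_fee(0.1386, 0.25)
```

Under that stake-surcharge assumption, the 13.86% no-fee return becomes roughly 1.66% with a 12% fee, which matches the figure quoted above. With a 25% fee it comes to roughly -8.91%, close to but not the same as the quoted -8.72%, so the paper's fee calculation may well differ from this sketch.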

The authors conclude that the prediction market and the betting odds provide the best indication of the outcome. They agree with previous work that tipsters are generally quite poor at prediction.