If you’re even a casual tennis fan like me, then you’ve probably seen reports of a major scandal brewing on the professional tour. A joint investigation by BuzzFeed and the BBC, cheekily called “The Tennis Racket,” has purportedly uncovered evidence of widespread match-fixing. While the story itself is sensational, I’m particularly interested in three important lessons that this alleged betting scandal can teach us about getting it right with analytics.
FiveThirtyEight.com has a great overview of the story, along with some interesting observations about what conclusions can be drawn regarding the innocence or guilt of the players identified in the investigation. Here’s a synopsis of what BuzzFeed and the BBC did, and what happened afterward:
To identify evidence of potential match-fixing, the investigators analyzed betting odds from multiple sportsbooks for 26,000 men’s tennis matches played over a number of years. They looked for situations where the odds shifted substantially against a player in the lead-up to a match and that player then did, indeed, lose. Such situations could indicate that bettors had inside information, in advance, about a player’s likelihood of losing. The investigators flagged 15 players for whom such odds movements took place very frequently and suggested that these individuals were quite likely to be involved in match-fixing.
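To make the screening idea concrete, here is a minimal sketch of that kind of rule. It is not the investigators’ actual algorithm: the `Match` record, the decimal-odds representation, the `min_shift` and `min_flags` cutoffs, and the sample matches are all assumptions made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Match:
    player: str
    opening_odds: float  # decimal odds on the player winning when betting opened
    closing_odds: float  # decimal odds on the player winning just before the match
    won: bool


def implied_prob(decimal_odds: float) -> float:
    """Rough win probability implied by decimal odds (ignores the bookmaker's margin)."""
    return 1.0 / decimal_odds


def is_suspicious(m: Match, min_shift: float = 0.10) -> bool:
    """Flag a match when the market moved against the player and the player then lost."""
    drop = implied_prob(m.opening_odds) - implied_prob(m.closing_odds)
    return drop >= min_shift and not m.won


def flag_players(matches, min_shift: float = 0.10, min_flags: int = 2):
    """Return players with at least `min_flags` suspicious matches (hypothetical cutoffs)."""
    counts = Counter(m.player for m in matches if is_suspicious(m, min_shift))
    return sorted(p for p, n in counts.items() if n >= min_flags)


# Invented toy data, not real players or real odds.
matches = [
    Match("A", 1.50, 2.20, won=False),  # big move against A, then a loss
    Match("A", 1.80, 2.50, won=False),  # same pattern again
    Match("B", 1.60, 1.65, won=False),  # lost, but the odds barely moved
    Match("B", 2.00, 3.00, won=True),   # big move, but B won anyway
    Match("C", 1.40, 1.90, won=False),  # one suspicious match only
]

print(flag_players(matches))
```

Note that a rule like this only surfaces patterns worth a closer look. An injury rumor or a late change in conditions can move the odds just as sharply as inside knowledge of a fix.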
Here’s where things got really interesting: While BuzzFeed and the BBC didn’t name the 15 players, they did release an anonymized data file that had been used in the analysis. It took only a few hours for several analysts to identify who the 15 players were. While most of the names might not be familiar to casual fans like me, one does stand out: former world No. 1 Lleyton Hewitt.
With such a prominent name identified, people began to dig deeper. One of the analysts who unmasked the 15 names took a careful look at eight different Hewitt matches that were flagged as suspicious. In each case, a better understanding of the underlying dynamics showed that there was little reason to believe that anything untoward took place. Additionally, when analysts applied a subtle change to the investigators’ methodology for identifying potentially fixed matches, they came up with a very different list of suspected players. Investigations remain ongoing.
I took away three lessons about analytics from this story:
- Analysts are able to do amazing things with data today. It took mere hours for analysts to identify BuzzFeed-BBC’s 15 suspects from the anonymous data. The combination of clever thinking and modern processing technology is powerful, indeed.
- Domain expertise is as important as analytical prowess. When an expert who really understood tennis betting examined Lleyton Hewitt’s matches, he was able to show that the investigators’ algorithm missed the boat. A carefully constructed algorithm didn’t stand up to the scrutiny of a tennis betting expert who applied far more rudimentary techniques.
- Careful analysis design is critical. A subtle change in how tennis betting odds were analyzed yielded very different conclusions. We should take that to heart when we analyze data to solve difficult business problems. The more complex the data, the more time you need to spend on detailed analysis design to ensure that you get the right answers.
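The third lesson is easy to demonstrate on a small scale. The story doesn’t say exactly what the analysts changed, so the sketch below simply assumes the change is a different cutoff for how far the odds must move before a loss counts as suspicious. The records, the `suspects` helper, and both cutoffs are invented for illustration.

```python
from collections import Counter

# Invented records: (player, drop in implied win probability before the match, lost?)
records = [
    ("P1", 0.11, True), ("P1", 0.12, True), ("P1", 0.13, True),
    ("P2", 0.16, True), ("P2", 0.18, True), ("P2", 0.04, True),
    ("P3", 0.13, True), ("P3", 0.17, True), ("P3", 0.15, False),
]


def suspects(records, min_shift, min_flags=2):
    """Players with at least `min_flags` losses preceded by a drop of `min_shift` or more."""
    counts = Counter(
        player for player, drop, lost in records if lost and drop >= min_shift
    )
    return {player for player, n in counts.items() if n >= min_flags}


loose = suspects(records, min_shift=0.10)   # hypothetical cutoff
strict = suspects(records, min_shift=0.15)  # slightly stricter hypothetical cutoff

print(sorted(loose))
print(sorted(strict))
```

Moving one cutoff by five percentage points shrinks the suspect list from three players to one, which is why the design choices behind a screen like this deserve as much scrutiny as its results.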