By Andrew Puopolo
This weekend marks the 3rd round of the FA Cup on the British soccer calendar. For those (presumably American) readers who are unfamiliar with the FA Cup, it is a competition very similar in structure and tradition to March Madness. Every club, professional or amateur, is invited to take part. Qualifying begins each August and continues until 20 teams from the lower leagues (3rd division or below) reach the 3rd round, which traditionally takes place the first weekend in January. Therefore, theoretically, any team at all (with a little bit of luck) could have the chance to take on Manchester United at Old Trafford. The FA Cup has recorded its fair share of Cinderella moments and “giant-killings” over the years. For example, in 2013, the fifth division’s Luton Town defeated the Premier League’s Norwich City.
Based on this history of giant-killings, I decided to try to determine the probability of a team getting to a certain round given its divisional status (e.g. what is the probability that a team gets to the quarterfinals given that it is in the second division), and in effect determine the probability of a true Cinderella run that would hit a 10 on the “Dickie V moaning in excitement” scale. Unfortunately, because of promotion and relegation, the divisional status of each team changes from year to year, and I was only able to easily find divisional status by season for the last four years, which I therefore used as my sample.
Unlike March Madness, there is no seeding in the FA Cup. In every round, all the participating teams are put into a hat and matchups are determined by pulling the teams out, with the team to come out first set to play the match at their home stadium.
First, I went through every single FA Cup game over the last four years and recorded the outcomes of matchups between teams from each pair of divisions, which can then be translated into probabilities. This is reflected in the table below as the fraction of games won by teams from a given row against teams from a given column. Note: any team from below the fourth division was classified as Non-League.
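The tallying step can be sketched in a few lines of Python. This is only a minimal illustration, not the spreadsheet or script actually used for the table: the `TOY_RESULTS` records and the `win_fractions` helper are hypothetical names, and the records are invented purely to show the mechanics.

```python
from collections import defaultdict
from fractions import Fraction

# Toy records of (winner's division, loser's division).
# INVENTED for illustration only; these are NOT real FA Cup results.
TOY_RESULTS = [
    ("Premier League", "Championship"),
    ("Premier League", "Championship"),
    ("Championship", "Premier League"),
    ("Premier League", "League One"),
    ("League Two", "Non-League"),
]


def win_fractions(results):
    """Return {(row_division, column_division): fraction of games won by row}."""
    wins = defaultdict(int)
    games = defaultdict(int)
    for winner, loser in results:
        wins[(winner, loser)] += 1
        # Each game counts once from each side's point of view.
        games[(winner, loser)] += 1
        games[(loser, winner)] += 1
    return {pair: Fraction(wins[pair], games[pair]) for pair in games}


table = win_fractions(TOY_RESULTS)
for (row, column), fraction in sorted(table.items()):
    print(f"{row} vs {column}: {fraction}")
```

A game between two teams from the same division is counted once from each side, so same-division entries come out at exactly one half.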
It should also be noted that over the last four years, the third round has consisted of 20 Premier League teams and 24 Championship teams (because the 3rd round is when these teams enter the competition), plus an average of 9.75 teams from League One, 6.25 teams from League Two and 4 Non-League teams. Based on the probabilities above, I was able to calculate the expected number of teams from each division in each round of the cup. Note: because the sample size for matchups between teams from the lower three categories was too small, I gave them each a 50% chance of beating each other regardless of divisional status. This is a reasonable assumption because further down the leagues the income disparity levels off and the teams are of much closer caliber. Likewise, I set the probability of advancing at 50% whenever a team plays an opponent from its own division.
For each division, I took the probability of drawing an opponent from each respective division, multiplied it by the probability of beating a team from that division, and summed over the possible opponents to get the probability of advancing. I then used the formula for expected value to find the expected number of teams from each division in each round. This needed to be repeated for every subsequent round because, as the rounds went on, the share of Premier League teams increased and thus the probability of advancing for teams in the lower leagues decreased. Below is my expected number of teams remaining in each round of the FA Cup.
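The round-by-round calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the exact computation behind the tables. The third-round counts are the averages quoted in the article, and the 7/8 Premier League vs Non-League entry matches the one-in-eight upset rate mentioned for the sample. Every other entry involving the top two divisions in `WIN` is a made-up placeholder, and the names `WIN`, `advance_probabilities` and `expected_counts_by_round` are hypothetical. The sketch also assumes the chance of drawing a division is simply its share of the teams left, and that every tie produces a winner.

```python
DIVISIONS = ["Premier League", "Championship", "League One", "League Two", "Non-League"]

# Average number of third-round teams per division (from the article).
THIRD_ROUND_COUNTS = [20.0, 24.0, 9.75, 6.25, 4.0]

# WIN[i][j] = probability that a team from division i beats one from division j.
# PLACEHOLDER values, except:
#   - 0.5 on the diagonal and among the lower three categories (article's assumption)
#   - 0.875 / 0.125 for Premier League vs Non-League (one-in-eight upset rate)
WIN = [
    [0.500, 0.70, 0.80, 0.85, 0.875],
    [0.300, 0.50, 0.65, 0.70, 0.750],
    [0.200, 0.35, 0.50, 0.50, 0.500],
    [0.150, 0.30, 0.50, 0.50, 0.500],
    [0.125, 0.25, 0.50, 0.50, 0.500],
]


def advance_probabilities(counts, win):
    """P(advance) per division = sum over opponents of P(draw them) * P(beat them)."""
    total = sum(counts)
    size = len(counts)
    return [
        sum(counts[j] / total * win[i][j] for j in range(size))
        for i in range(size)
    ]


def expected_counts_by_round(counts, win, rounds):
    """Expected teams per division for the starting round and each later round."""
    history = [list(counts)]
    for _ in range(rounds):
        current = history[-1]
        probs = advance_probabilities(current, win)
        # Expected survivors = teams present * probability each one advances.
        history.append([n * p for n, p in zip(current, probs)])
    return history


# Third round plus five more rounds, ending with the two finalists.
for round_counts in expected_counts_by_round(THIRD_ROUND_COUNTS, WIN, 5):
    print([round(n, 2) for n in round_counts])
```

Because each pair of entries `WIN[i][j]` and `WIN[j][i]` sums to one, the expected total halves every round (64, 32, 16 and so on), which is a useful sanity check on any matrix plugged in.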
Then, using these expected values, I was able to determine the probability of a given team from a specific division reaching a specific round of the competition by multiplying together the probabilities of advancing through each earlier round. My results are below, with percentages rounded to two decimal places.
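The multiplication step is just a running product, which might look like this in Python. The per-round figures in `EXAMPLE_ADVANCE` are placeholders chosen for illustration and are not values from the results table; `reach_probabilities` and `as_percentages` are hypothetical helper names.

```python
from itertools import accumulate
from operator import mul

ROUNDS_AFTER_THIRD = ["Fourth round", "Fifth round", "Quarterfinals", "Semifinals", "Final"]

# PLACEHOLDER probabilities of winning in the third round, fourth round, and so on
# for a single division; not taken from the article's tables.
EXAMPLE_ADVANCE = [0.30, 0.25, 0.22, 0.20, 0.20]


def reach_probabilities(advance_probs):
    """Running product: probability of surviving every round up to each stage."""
    return list(accumulate(advance_probs, mul))


def as_percentages(probs):
    """Convert probabilities to percentages rounded to two decimal places."""
    return [round(100 * p, 2) for p in probs]


reached = as_percentages(reach_probabilities(EXAMPLE_ADVANCE))
for name, pct in zip(ROUNDS_AFTER_THIRD, reached):
    print(f"{name}: {pct}%")
```

Each entry is the chance of reaching the named round, so the list can only shrink as the competition goes on.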
Some oddities do occur because of the small sample size. For example, Luton Town’s victory over Norwich City in 2013 creates the illusion that the probability of a Non-League side defeating a Premier League side is one in eight, when in reality it is much, much lower than that. It should also be noted that the probabilities for teams in the lower three categories are conditional on reaching the third round of the competition. There are 24 teams in League One, but on average only 9.75 of them qualified for Round 3; likewise, League Two has 24 teams, but on average only 6.25 qualified. Based on these numbers, however, we should see a Non-League team reach the final of the FA Cup about once every 125 years.
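The article does not spell out how the 125-year figure was obtained. One plausible reading, offered here as an assumption, is that the expected number of Non-League finalists per year equals the average of 4 Non-League third-round teams times each one's probability of reaching the final, and that the waiting time is the reciprocal of that. A short Python sketch with hypothetical function names:

```python
def years_between_finalists(teams_per_year, reach_final_prob):
    """Average wait in years, assuming expected finalists per year = teams * probability."""
    expected_per_year = teams_per_year * reach_final_prob
    return 1.0 / expected_per_year


def implied_reach_probability(teams_per_year, years):
    """Invert the relationship: per-team probability implied by a given wait."""
    return 1.0 / (teams_per_year * years)


# Back out the per-team probability implied by "once every 125 years"
# with 4 Non-League teams in the third round each season.
implied = implied_reach_probability(4, 125)
print(f"Implied chance of reaching the final: {implied:.2%}")
print(f"Years between finalists: {years_between_finalists(4, implied):.0f}")
```

Under this reading, once every 125 years corresponds to each Non-League third-round team having about a 0.2% chance of reaching the final.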