By: Jackson Weaver
Last month marked the beginning of the Major League Soccer season, and it began in classic, unpredictable MLS fashion. Toronto FC, the league winners last year and projected winners this year, have lost four out of five and sit at the bottom of the Eastern Conference. Los Angeles FC sits second in the Western Conference in its first year in the league, having beaten Seattle, Real Salt Lake, and Vancouver, three well-established teams. In MLS, every fan can have legitimate hopes of their team winning the league in a given year, but exactly how random is MLS? How does a team’s record correlate from one year to the next?
To answer these questions, I used a technique similar to the one in current HSAC Co-President Andrew Puopolo’s article about parity in the four major professional North American sports leagues, in which he looked at the correlation between a team’s record in one season and its record in each of the next five seasons. To get the data, I scraped the points per game for each team in MLS since the 2000 season. I used average points per game because the number of games each team plays in a season has varied over time.
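As a rough illustration of the calculation described above, and not the code actually used for this article, the sketch below pools every pair of seasons a given number of years apart and computes their Pearson correlation. The function names, the data layout, and the numbers in `ppg` are all assumptions made up for the example; they are not real MLS results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(ppg, lag):
    """Correlate points per game in season t with season t + lag.

    ppg maps team -> {season: points per game}. Pairs are pooled across
    all teams and all seasons for which both seasons exist.
    """
    base, later = [], []
    for seasons in ppg.values():
        for year, value in seasons.items():
            if year + lag in seasons:
                base.append(value)
                later.append(seasons[year + lag])
    return pearson(base, later)


# Hypothetical numbers for illustration only, not real MLS results.
ppg = {
    "Team A": {2000: 1.9, 2001: 1.7, 2002: 1.8, 2003: 1.5},
    "Team B": {2000: 1.1, 2001: 1.3, 2002: 1.0, 2003: 1.2},
    "Team C": {2000: 1.4, 2001: 1.5, 2002: 1.6, 2003: 1.4},
}

for lag in (1, 2, 3):
    print(lag, round(lagged_correlation(ppg, lag), 3))
```

Running the same function for lags of one through five years gives one correlation value per lag, which is the kind of series compared across leagues here.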
I then repeated the process with the English Premier League over the same time period to compare MLS to another league, scraping the data and computing the correlation between a team’s records in the same way.
To explore this further, I then examined the Championship, where the teams are much more equal financially, meaning the amount of money teams spend is much more similar than in the EPL. Surprisingly, the Championship’s correlations over five years are similar to those of MLS and significantly lower than those of the EPL.
The English Premier League and the Championship pose a challenge to this process, as the teams in both leagues change every year through promotion and relegation. To resolve this problem, I treated each league slot as one continuous record when calculating the points per game for each season: if a team was relegated, its record for the following season was the record of the team promoted in its place. If the team finished third to last, it was replaced with the top promoted team; if it finished second to last, it was replaced with the second promoted team; and if it finished last, it was replaced with the winner of the promotion playoffs.
For example, in 2000 Sheffield Wednesday placed 19th and was relegated and replaced by Manchester City, who finished 2nd in the Championship. Then in 2001 Manchester City was relegated after placing 18th and replaced by Fulham, who finished 1st. Fulham lasted 13 years in the Premier League before being relegated in 2014 after finishing 19th and replaced by Burnley, who finished 2nd. A year later, Burnley was relegated after finishing 19th and replaced by 2nd-place Watford. In terms of data collection, all of these teams’ records over the periods they were in the Premier League are recorded under Sheffield Wednesday, as it was the original team in 2000. This process was repeated for every team this happened to in the Premier League, and was repeated for both promotion and relegation in the Championship.
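The replacement rule can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used for the article: the function name `advance_slots`, the dictionary layout, and a 20-team league are assumptions, and the "placeholder" entries stand in for promoted teams the example above does not name.

```python
# Finishing position (in a 20-team league) -> which promoted side inherits the slot.
REPLACEMENT_BY_POSITION = {
    18: "champion",        # third to last -> top promoted team
    19: "runner_up",       # second to last -> second promoted team
    20: "playoff_winner",  # last -> winner of the promotion playoffs
}


def advance_slots(slots, positions, promoted):
    """Return the slot -> team mapping for the next season.

    slots maps a slot name (the original team) to the team currently in it,
    positions maps each current team to its final league position, and
    promoted maps "champion", "runner_up", "playoff_winner" to team names.
    """
    next_slots = {}
    for slot, team in slots.items():
        role = REPLACEMENT_BY_POSITION.get(positions[team])
        next_slots[slot] = promoted[role] if role else team
    return next_slots


# Follow the Sheffield Wednesday slot through the first two steps of the example.
slots = {"Sheffield Wednesday": "Sheffield Wednesday"}

# 2000: Sheffield Wednesday finish 19th; Manchester City were 2nd in the division below.
slots = advance_slots(
    slots,
    positions={"Sheffield Wednesday": 19},
    promoted={
        "champion": "placeholder champion",
        "runner_up": "Manchester City",
        "playoff_winner": "placeholder playoff winner",
    },
)

# 2001: Manchester City finish 18th; Fulham were 1st in the division below.
slots = advance_slots(
    slots,
    positions={"Manchester City": 18},
    promoted={
        "champion": "Fulham",
        "runner_up": "placeholder runner-up",
        "playoff_winner": "placeholder playoff winner",
    },
)

print(slots)
```

After these two steps the slot that started as Sheffield Wednesday is held by Fulham, so Fulham’s later seasons would be appended to the same continuous record.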
A potential source of error in these findings is that MLS has added many expansion teams over the past seventeen years; because teams tend to be better adapted, and thus improve, after a year in the league, this may lower the correlations. Additionally, there are potential pitfalls in how the relegated teams were handled for the EPL and the Championship. Ultimately, the promoted teams are not the same teams as the ones they replace, and their skill levels are different, which can lower the correlation values for the promoted teams’ first season in the league. This effect can be amplified in the Championship, where there is even more flux because teams are both promoted and relegated.
Unsurprisingly, the correlation in the English Premier League is significantly higher than in MLS, meaning there is much more predictability from year to year in the EPL than in MLS. Interestingly, the correlation in the Championship is much more comparable to MLS than to the EPL. This supports the theory that leagues where the money is more evenly divided, such as MLS and the Championship, provide a much more even playing field than leagues where the money is significantly skewed, such as the EPL. With correlations as low as those in MLS, you have stories like the San Jose Earthquakes, who placed 14th overall in 2011 and then 1st in 2012, but also stories like the Colorado Rapids falling from 2nd to 20th in 2017. It will be very interesting to see how this MLS season plays out, because at this point, it is anybody’s game.
*Note: The San Jose Earthquakes moved to Houston and became the Houston Dynamo in 2006, and an ownership group bought the Earthquakes again in 2008. For the purpose of the data, the Houston Dynamo goes back to 1996, while the San Jose Earthquakes go back to 2008.
Editor’s Note: If you have any questions about this article, please feel free to reach out to Jackson at jacksonweaver@college.harvard.edu