
How Well Did my 2020 LCS, LCK, and LEC Prediction Models Perform?

Now that the 2020 League of Legends World Championship is over, I thought it might be time to assess how well the prediction models I wrote for the LCS, LCK, and LEC performed. You can view them here.

First, it's important to know what the predictions actually mean. A prediction which favours one team over the other does not guarantee that the favoured team will win. In fact, even when the favourite has a 90% chance to win, the underdog should still win 1 in 10 matches.

Cloud9 vs Team Liquid

Knowing this, the best way to judge whether the model is accurate is to check whether events that have an x% chance of happening actually happen x% of the time. The prediction data is hosted on GitHub. After downloading the match data and running it through a quick Python script, this was the result.

Overall performance

I grouped the predictions together into 5% chunks. For example, all the predictions between 57.5% and 62.5% get grouped into 60%. The average actual outcome of each group (the share of those predictions that came true) is then the plotted value.
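The grouping described above could be sketched roughly as follows. This is not the script used for the post; the function names, the sample matches, and the assumption that each match contributes two predictions (one per team) are all illustrative.

```python
from collections import defaultdict

def bin_centre(p, width=0.05):
    """Snap a probability to the nearest multiple of `width` (0.58 -> 0.60)."""
    return round(round(p / width) * width, 2)

def calibration(predictions, width=0.05):
    """Group (expected, actual) pairs into bins and average the actual outcomes.

    Returns {bin_centre: (mean_actual, count)}.
    """
    groups = defaultdict(list)
    for expected, actual in predictions:
        groups[bin_centre(expected, width)].append(actual)
    return {c: (sum(v) / len(v), len(v)) for c, v in sorted(groups.items())}

# Hypothetical matches: (blue side win probability, 1 if blue won else 0)
matches = [(0.58, 1), (0.61, 0), (0.62, 1), (0.71, 1), (0.50, 0)]

# Assumption: each match yields two predictions, one for each team
predictions = []
for p_blue, blue_won in matches:
    predictions.append((p_blue, blue_won))
    predictions.append((1 - p_blue, 1 - blue_won))

for centre, (mean_actual, n) in calibration(predictions).items():
    print(f"{centre:.2f}: actual {mean_actual:.2f} over {n} predictions")
```

A well-calibrated model would show a mean actual outcome close to each bin centre, and the count per bin is what would drive the point size in a plot.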

The closer the points are to the dotted line, the more accurate the predictions are. If the predictions were perfect, all points would lie along the line.

There were 966 matches played in 2020, and each match produces two predictions, one for each team, giving 1932 predictions in total. The size of each point corresponds to the number of predictions with that expected outcome. For perspective, out of the 1932 predictions, 256 are included in the 0.5 expected outcome point, but just 4 in the 0.95 expected outcome point. This explains why things get a little off track at the extremes. Few matches are played between teams with that large a rating difference, so it is hard to draw firm conclusions about the accuracy of the model there without more matches being played.

You can see that at the 0.9 expected outcome point, these events actually happened 100% of the time, instead of 90%. However, only 17 matches fall into this group, and the favourite won all 17, whereas the model expected the underdog to win at least 1 of them.

Something similar happens at the 0.95 expected outcome point. In the 4 games played here, the underdog won 1, when they were predicted to win just 1 in 20.
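To get a feel for how unusual these two results are, here is a rough binomial calculation. It assumes every prediction in a group sits exactly at the group's value (0.9 or 0.95) and that matches are independent, neither of which is strictly true, so treat the numbers as a ballpark only.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# 0.9 group: 17 matches, and the favourite won all 17
p_sweep = binom_pmf(17, 17, 0.9)

# 0.95 group: 4 matches, and the underdog won at least 1
p_upset = 1 - binom_pmf(0, 4, 0.05)

print(f"P(favourite wins all 17 at 90%): {p_sweep:.3f}")
print(f"P(at least one upset in 4 at 5%): {p_upset:.3f}")
```

Under these assumptions both outcomes have a probability somewhere between one in six and one in five, so neither is strong evidence against the model.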

Red vs. Blue

Unlike in more traditional sports, home team advantage barely exists in League of Legends. All regular season matches are played in the same studio (or online in 2020) and the playoff finals are held in a stadium within the league's region. The teams also don't usually represent a city, region, or country.

There is, however, an advantage for the team playing on the blue side of the map. In traditional sports, this sort of advantage is resolved by the teams swapping sides at half-time. This isn't the case in League of Legends.

The reason for this advantage is complicated and is made up of many factors, including gameplay advantages and the champion pick/ban order.

I noticed this advantage clearly showing in the 2020 match data when separating the red and blue side teams.

Red vs. Blue

At almost every data point, the actual outcome for the blue side teams is higher than both the expected outcome and the actual outcome for the red side teams. Overall, the win rate for blue side teams is 53.93% and for red side teams 46.07%, a decent edge for the blue side. However, there seems to be significant variation between regions.

Region | Blue Side Win % | Red Side Win %

In the LEC it looks as though the red side has the advantage. This is a surprise. There were 241 games played in the LEC in 2020, so it could just be random chance that the red side managed to pull off more wins than expected. It could also be the regional meta of the LEC: there might be something in the way that LEC teams play which gives the red side an advantage. It's hard to tell.
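As a rough check on how much of this could be chance, the sketch below estimates how far a win rate can drift from 50% over a given number of games when neither side has any real advantage. It uses the normal approximation to the binomial distribution and assumes independent games; the function name is just for illustration.

```python
from math import sqrt

def win_rate_margin(games, p=0.5, z=1.96):
    """Approximate 95% margin of error for a win rate over `games` games,
    using the normal approximation to the binomial distribution."""
    return z * sqrt(p * (1 - p) / games)

lec_margin = win_rate_margin(241)  # LEC games in 2020
all_margin = win_rate_margin(966)  # all 2020 matches across the three leagues

print(f"LEC: 50% +/- {lec_margin * 100:.1f} points")
print(f"All regions: 50% +/- {all_margin * 100:.1f} points")
```

With only 241 games, a side with no advantage at all could plausibly finish anywhere from about 44% to 56%, so a modest red side edge in one region is hard to tell apart from noise. The overall 53.93% blue side win rate sits just outside the narrower range for 966 matches.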

The question now is whether this should be added to the model to counter the effects of a side advantage. This isn't as easy as adding a home team advantage to a traditional sports model, or even an advantage to the player playing white in chess. This is partly due to differences between regions, but also because the game keeps changing. During the competitive season there are marginal changes that affect gameplay, and between seasons there are large changes which can have a big impact on side advantage.

Which side each team will be playing on is also often not known until moments before the match starts, particularly during playoff matches and best-of series. This means accurate predictions will not be possible until the sides are known.

At the end of the day, these predictions aren't meant to be used for anything serious (like betting, which probably won't go well). It's just a little project I started after being inspired by some of FiveThirtyEight's sports predictions. I think I will add a blue side advantage for the 2021 season based on the 2020 numbers and then see how things perform with and without it. Besides that, I'm pretty happy with the performance of the model in 2020.
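For anyone curious what a blue side advantage might look like in practice, here is a minimal sketch. It assumes an Elo-style rating model on the usual 400-point scale, which is an assumption for illustration and not a description of the actual models. The function names and the 1500 ratings are hypothetical.

```python
from math import log10

def expected_score(rating_a, rating_b, bonus_a=0.0):
    """Standard Elo expected score for team A, with an optional rating bonus for A."""
    return 1 / (1 + 10 ** ((rating_b - rating_a - bonus_a) / 400))

def bonus_for_win_rate(win_rate):
    """Rating bonus that gives two equally rated teams the given win rate."""
    return 400 * log10(win_rate / (1 - win_rate))

# Overall 2020 blue side win rate from the match data
blue_bonus = bonus_for_win_rate(0.5393)

print(f"Blue side bonus: {blue_bonus:.1f} rating points")
print(f"Equal teams, blue side: {expected_score(1500, 1500, blue_bonus):.4f}")
print(f"Equal teams, no bonus:  {expected_score(1500, 1500):.4f}")
```

Because the bonus is a single number added to the blue side's rating, it would be easy to switch on and off when comparing predictions with and without it, or to fit separately per region.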

Made with 💜 in Tāmaki Makaurau, Aotearoa. © Jono Davis.