Monday, June 25, 2018

The World Cup in Numbers

By Dr. Patrick English – World Cup Data Hub ‘Chief of Stats’

As we move today into round three of the group stage games, let’s take a look at the numbers so far. With 32 games now played, this blog post looks at the key data from both rounds of group stage games, how successful different formations and styles of play have been (and which kinds of teams have most often been using them), and finally what our regression model currently identifies as the best predictors of a winning team at this tournament.

Firstly, round two of the group stage games has been characterised by more goals than round one, fewer draws, and far more wins for higher ranked teams. In round one, the average goals scored stood at 1.2 per team, per match. At the conclusion of round two, this figure had increased to 1.5 goals per team, per game on average. We’ve also seen two fewer draws in the second round than the first, with the number of drawn games falling from six to four. Finally, while in round one top ranked teams (pot one) recorded only two victories, in round two there were six wins for the highest ranked teams in the competition. So perhaps we can, with some notable exceptions, suggest that round two marks a shift toward more ‘business as usual’ in terms of footballing hierarchy?

Secondly, three-defender formations appear to be enjoying a considerable amount of success at the tournament so far. Of those formations used more than once, five have a win rate of more than 50%. Of those five, two feature a back three and their combined success rate is 66%. The most successful four-defender formations have been the “4-2-3-1” (54% win rate), and the “4-3-3” and “4-4-2” (each on 50%). 
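
For anyone wanting to reproduce this kind of figure, here is a minimal sketch of how a win rate per formation could be worked out, counting only formations used more than once. The function name, the record layout and the sample records are all assumptions made up for illustration; they are not the Data Hub’s actual code or data.

```python
from collections import defaultdict

def win_rates(records, min_uses=2):
    """Return {formation: win rate} for formations used at least min_uses times.

    records is a list of (formation, result) pairs, one per team per game,
    where result is "W", "D" or "L". This layout is an assumption.
    """
    used = defaultdict(int)
    won = defaultdict(int)
    for formation, result in records:
        used[formation] += 1
        if result == "W":
            won[formation] += 1
    # Drop formations seen fewer than min_uses times, as in the text above.
    return {f: won[f] / used[f] for f in used if used[f] >= min_uses}

# Made-up illustrative records, NOT the tournament data.
sample = [
    ("4-2-3-1", "W"), ("4-2-3-1", "L"), ("4-2-3-1", "W"),
    ("3-4-2-1", "W"), ("3-4-2-1", "D"),
    ("5-4-1", "L"),
]
rates = win_rates(sample)
```

Note that draws count against the win rate here; whether the figures quoted above treat draws the same way is not something this sketch can confirm.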


According to our data, direct attacking remains the most successful style of play, but only just ahead of possession football, with win rates of 57% and 55% respectively. Interestingly, the “4-2-3-1” formation appears to be the best option when it comes to beating higher ranked teams, while the attack-minded “4-3-3” appears to be good at doing the job against lower ranked sides (as does a traditional “4-4-2”).

Some strong, but perhaps quite logical, stories have been developing regarding the usage of formations and styles of play by higher and lower ranked teams at this year’s World Cup. For instance, teams in pot one have used a total of seven different formations so far, with “4-2-3-1” and “3-4-2-1” the most frequently deployed.


Conversely, teams in pot four have used a total of six formations and have been much more likely to field solid back fours, lone strikers, and anchoring defensive midfielders in their games so far this tournament. Interestingly, though, the most attacking formations seem to be coming from teams in pot two, with the “4-3-3” used by teams in this seeding group a total of five times.


Teams in pot four have also been using a ‘balanced’ style of play in over half of their total games, while teams in pots one and two have used a ‘possession’ focus in over half of theirs. Teams in pot three have used a ‘defensive’ approach the most, a total of four times.

Also, as was discussed in the inaugural World Cup Data Hub podcast, there seems to be a strong connection between a higher team FIFA World Ranking and the ability to get more shots on target and at a better rate of accuracy than their opponents. The graph below demonstrates a clear trend between fewer shots on target and lower team ranking. This, combined with the shooting accuracy statistics, suggests that efficiency and accuracy in shooting are a strong component of being a successful footballing team on the world stage.
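
As a quick illustration of the accuracy measure, the sketch below assumes shooting accuracy means shots on target divided by total shots, which is the usual definition but is not spelled out in this post. The function name and the numbers are made up for the example.

```python
def shooting_accuracy(shots_on_target, total_shots):
    """Share of all shots that were on target (0.0 when no shots were taken)."""
    if total_shots == 0:
        return 0.0
    return shots_on_target / total_shots

# Made-up numbers for two hypothetical teams, NOT tournament data.
team_a = shooting_accuracy(shots_on_target=6, total_shots=15)
team_b = shooting_accuracy(shots_on_target=3, total_shots=12)
```

In this toy case team A both hits the target more often and does so at a better rate, which is the pattern described above for higher ranked sides.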


Finally, the logit regression model (a statistical tool which is being used to figure out what combination of teams and tactics is associated with winning games) is reporting some very interesting findings. Generally speaking, it brings much of the above together into one simple story, which is exactly what regressions are so good for. According to this analysis, the strongest current predictors of a winning team at the World Cup are being in pot two (though the results also suggest that, relative to pot four teams, pot one and pot three sides are also better at winning), playing with fewer defenders, and facing fewer shots.
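
To give a feel for what a logit model does, here is a small self-contained sketch that fits one by plain gradient descent. It is not the Data Hub model: the three predictors (a pot two flag, a back three flag standing in for “fewer defenders”, and shots faced divided by ten), the fitting routine and every row of data are assumptions invented for illustration.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, row):
    """Predicted win probability; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], row)))

def fit_logit(X, y, lr=0.1, epochs=5000):
    """Fit a logistic regression by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, target in zip(X, y):
            err = predict_proba(w, row) - target
            grad[0] += err
            for j, xi in enumerate(row):
                grad[j + 1] += err * xi
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

# Made-up rows: [is_pot_two, back_three, shots_faced / 10]. NOT real data.
X = [
    [1, 1, 0.6], [1, 1, 0.9], [1, 0, 0.8], [0, 1, 0.7], [0, 0, 0.9],
    [1, 0, 1.5], [0, 1, 1.6], [0, 0, 1.2], [0, 0, 1.8], [1, 1, 1.7],
]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # 1 = win, 0 = no win

w = fit_logit(X, y)
p_strong = predict_proba(w, [1, 1, 0.6])  # pot two, back three, few shots faced
p_weak = predict_proba(w, [0, 0, 1.8])    # none of those, many shots faced
```

A negative coefficient on shots faced means that, holding the other predictors fixed, facing more shots lowers the predicted chance of a win. In practice a statistics package would be used instead of hand-rolled gradient descent, since it also reports standard errors and significance.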


The first of these results highlights how, as well as teams in pot four, many of the highest ranked teams in the competition have been struggling at this year’s tournament: Poland and Argentina (0 wins) spring to mind, but also Portugal and Brazil, who currently have only one win each. Conversely, pot two teams have been doing very well (Croatia, England, Mexico and Uruguay spring to mind). The latter two factors, I think, are quite closely connected; thinking back to last week’s podcast, this has very much been a growing theme at the World Cup and perhaps highlights something about the kind of chances and approach play that teams will have against opposition units fielding three centre backs. Teams playing against a back three are probably more likely to exploit the flanks and focus on crosses, with a packed-out midfield in front of the defence and three centre backs behind it perhaps resulting in fewer direct shooting opportunities.

So, there we have it! The World Cup so far in numbers, stats, and data. Stay tuned for the next podcast (releasing tomorrow) and for further blogs from the World Cup Data Hub team.