Still, there was one comment that seemed unaddressed in all of this analysis. Commenter Damon Hass stated:
“It seems to me that you would sure want to at least mention goal differential in an article talking about luck in the table.”
Damon and I exchanged a few emails on the topic, and I reassured him that
- I certainly agreed with him that goal differential could help explain teams’ luck relative to each other, and
- I would cover the issue of goal differential and its role in forecasting the likelihood of last year’s table in this blog’s next post.
Last year’s table is shown below. What Damon was surely pointing to was Tottenham Hotspur’s 72 points on a +20 goal differential, which looks a bit optimistic or lucky compared to Arsenal’s 73 points on a +35 goal differential or Chelsea’s 75 points on a +36 goal differential. This isn’t a difference of a few balls bouncing one way or another based upon luck, but a clear difference in goal differentials. Damon is certainly correct to look at Tottenham’s goal differential and point total with skepticism. So how does one quantify the relationship between the two to form a more numerical judgement?
It turns out that there is as strong a
relationship between the two as we might intuitively suspect. The graph
below plots the goal differentials and resultant points for every
38-match Premier League season to date. Readers will note the high
degree of correlation between the two metrics given an R² value of 0.92.
The graph contains not only the traditional regression line, but also the upper and lower bounds of the 50% and 90% prediction intervals. These prediction intervals communicate the impact of random behavior the regression model does not capture. The higher the R² value, the tighter the prediction intervals. They also indicate the bounds of the range of values in the dependent variable (in this case, points) that can be expected for a given independent variable value (in this case, goal differential). The 50% prediction interval defines the range that contains the middle 50% of point totals given a known goal differential, and the 90% prediction interval the middle 90%. As an example, 50% of the teams with a +20 goal differential would be expected to have a point total between 62 and 68 points.
The standard deviation in points expected
from a given goal differential can also be calculated from the
prediction intervals. It turns out that the standard deviation is 4.52
points for a known goal differential, which is a fair bit smaller than
the original estimate of seven to eight points. This standard deviation
can now be used along with the regression equation in the graph above
to estimate the expected points for each club as well as the percentile
within the expected points distribution that their actual point totals
represented. The table below is the 2012/13 EPL table updated with this
information. Rows that are red represent teams with point totals in
the top or bottom 5% of the distribution of points expected from their
goal differential, while yellow rows are those that are in the top or
bottom 20%. These colored rows represent outliers within the data table.
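The percentile calculation described above can be sketched in a few lines of Python. This is only an illustration under assumptions: it treats points for a given goal differential as normally distributed with the 4.52-point standard deviation, and the function names are hypothetical. The expected values are read off this post's own figures (roughly 80 expected points for Manchester United's +43, and 65 as the midpoint of the 62 to 68 range quoted for a +20 goal differential), not taken from the regression equation itself.

```python
from statistics import NormalDist

POINTS_SD = 4.52  # standard deviation in points for a known goal differential


def points_percentile(actual, expected, sd=POINTS_SD):
    """Share of the expected points distribution at or below `actual`.

    Assumes points are normally distributed around `expected`.
    """
    return NormalDist(mu=expected, sigma=sd).cdf(actual)


def prediction_interval(expected, coverage, sd=POINTS_SD):
    """Lower and upper bounds of the middle `coverage` share of point totals."""
    tail = (1 - coverage) / 2
    dist = NormalDist(mu=expected, sigma=sd)
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


# Manchester United: 89 actual points against roughly 80 expected.
print(round(points_percentile(89, 80), 3))

# Middle 50% of point totals around an assumed expectation of 65 points.
low, high = prediction_interval(65, 0.50)
print(round(low, 1), round(high, 1))
```

Under these assumptions the first value comes out near 0.98 and the interval runs from roughly 62 to 68 points, in line with the figures quoted in the post.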
Damon Hass’s suspicion regarding Tottenham’s point total has been
confirmed, but it turns out they weren’t the biggest overachievers
relative to their goal differential. That title belongs to Manchester
United, who outperformed 98% of the expected point totals associated
with a +43 goal differential. To put it another way, there is only about a 2%
chance that a team with a +43 goal differential would end up with
United’s 89 points or more. It turns out that their point total should have
been closer to 80. The table also suggests that Chelsea and Arsenal
would finish a bit ahead of Manchester City on average, and Liverpool
was the biggest underperformer in the table. The Reds should have been
competing for a Champions League spot last year given their goal
differential.
How do we answer just how lucky Manchester United was to win last year’s title, or Tottenham Hotspur to be competing for a Champions League position through the final game of last season? The answer lies in Monte Carlo simulation. In this case, each team’s projected point total based upon goal differential is the starting point, with 10,000 normally distributed random numbers generated from the projected point total with a standard deviation of 4.52 points. The resultant distribution of point totals is shown in the graph below.
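For readers who want to try this themselves, here is a minimal sketch of the simulation step. It is not the code used for this post. The function name, the seed, and the two projected point totals are assumptions for illustration: the projections are approximate values read from the text (about 80 for Manchester United and about 65 for a +20 goal differential), where a real run would use the regression output for all twenty clubs.

```python
import random
from statistics import mean, stdev

POINTS_SD = 4.52   # standard deviation in points for a known goal differential
N_DRAWS = 10_000   # number of random point totals generated per team


def simulate_points(projected_points, sd=POINTS_SD, n_draws=N_DRAWS, seed=2013):
    """Draw normally distributed point totals around each team's projection.

    `projected_points` maps team name -> projected points from goal differential.
    Returns a dict mapping team name -> list of simulated point totals.
    """
    rng = random.Random(seed)  # fixed seed (arbitrary) so the sketch is repeatable
    return {
        team: [rng.gauss(mu, sd) for _ in range(n_draws)]
        for team, mu in projected_points.items()
    }


# Illustrative projections only, approximated from figures quoted in the post.
projected = {"Manchester United": 80.0, "Tottenham Hotspur": 65.0}
simulated = simulate_points(projected)

for team, draws in simulated.items():
    print(f"{team}: mean {mean(draws):.1f}, sd {stdev(draws):.2f}")
```

Each team's simulated totals should center on its projection with a spread close to 4.52 points, which is what produces the bell-shaped curves in the graph.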
Readers will note a few differences between the graph above and the one from the previous post. First, the range of x-values is much smaller in the graph above, and the values on the y-axis approximately double in value. This is due to the narrower standard deviation in points used in the more recent simulation. The distributions of point totals in the graph above are also a bit closer to each other, and have been slightly re-ordered from left to right. This is due to the use of the goal differential as the starting point for the distribution. Perhaps the graph above does a better job than the actual table of displaying the closeness of competition in the Premier League last season, as judged by our eyes and by goals scored?
The point distributions above are then translated into resultant table positions that are summarized in the graph below.
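One way to sketch this translation step: treat the i-th simulated point total of every team as one simulated season, rank the teams within that season, and count how often each team lands in each position. The function below is a hypothetical illustration, not the post's actual code. It assumes every team has the same number of simulated totals, and it breaks exact ties by input order, which matters little when the totals are continuous random draws.

```python
from collections import Counter


def position_probabilities(simulated):
    """Convert simulated point totals into table-position frequencies.

    `simulated` maps team name -> list of simulated point totals, where index i
    across all teams is treated as one simulated season.
    Returns a dict mapping team -> {position: share of seasons in that position}.
    """
    teams = list(simulated)
    n_seasons = len(simulated[teams[0]])
    counts = {team: Counter() for team in teams}

    for i in range(n_seasons):
        # Highest point total takes position 1 in this simulated season.
        ranked = sorted(teams, key=lambda t: simulated[t][i], reverse=True)
        for position, team in enumerate(ranked, start=1):
            counts[team][position] += 1

    return {
        team: {pos: counts[team][pos] / n_seasons
               for pos in range(1, len(teams) + 1)}
        for team in teams
    }


# Tiny hand-made example with made-up totals: three seasons, two teams.
example = {"A": [80, 70, 90], "B": [75, 72, 60]}
probs = position_probabilities(example)
print(probs["A"][1])  # share of seasons in which A finishes first (2 of 3 here)
```

Feeding in 10,000 simulated totals per club instead of the toy lists gives title and Top Four percentages of the kind summarized in the graph.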
It turns out that even though Manchester United had the highest percentile of points versus goal differential, they still had a high enough goal differential that it would have won them the title 63% of the time. It was far from a dominant performance, but certainly good enough to suggest they earned the title. Tottenham’s goal differential, on the other hand, certainly suggested they overachieved in the table last year. They had a less than 10% chance of finishing in the Top Four, yet challenged for such a position up until the last day of the season. The methodology outlined in the last post suggested that Tottenham had a 59% chance of Champions League qualification given their 72 points and a standard deviation of 8.18 points.
None of the analysis above changes the fact that one fewer goal for Arsenal or one more for Tottenham would have likely reversed their actual positions. It also doesn’t change the fact that repeating such a one-point gap means Arsenal’s luck will eventually run out and they will find themselves missing out on the Champions League. What the above analysis does suggest is that last year Tottenham perhaps got by on a lot more luck than Arsenal. They shouldn’t have been competing for a Champions League position given their goal differential, and they will eventually regress towards the fifth or sixth table positions without an improvement in goal differential.
So if we know that Manchester United and Tottenham greatly overachieved points-wise versus their goal differential, can we draw any conclusions about what their point totals might look like this season? Also, what aspects of the goal differential – goal scoring, goal conceding, or both – need to be addressed at both clubs as the current season goes on? The answers to both questions will be covered in a later post. Until then we know that Manchester United seemed to be a deserving champion last year, while Tottenham Hotspur was very lucky to even be competing for a fourth-place table position.