The Statistically Significant Four

Gurudev Ilangovan

September 1, 2017


Why the big 4 deserve all the praise they get…

We all know tennis is a competitive sport. Winning one slam is so hard that a single title is probably enough to make a career. In the chart below, you might recognize a few of those small bars on the far right - Roger Federer with 19 grand slam titles, Rafa with 16 and so on. But try to recollect five people who make up the single big bar on the left hand side - people who have won just one slam!

Pretty hard, right? We do not remember many of those who have managed to win just one title, such as Andy Roddick (2003 US Open) or Goran Ivanisevic (2001 Wimbledon, as a wildcard!). However, did you notice that the x-axis starts from 1? What happens if we start from 0?

What just happened? The single tall bar represents the number of people who competed but never won a slam in their entire career. The next tallest bar is the number of people who have a single title to their name, as seen in the previous graph, but it has been completely dwarfed. Tennis isn’t competitive. It is brutal. If you meet a player today who says they are going to take up tennis as a profession, odds are quite high that they will still be slam-less if you meet them again fifteen years from now.

Now that we’ve seen how insane the competition is, we know how difficult it is to win a single slam. It is a big deal. And amidst this vicious competition, 3 people, together over the course of 15 years, have won 47 grand slam titles (out of a possible 60). You read that right. Enter the Big 3... the fourth comes in later.

Tennis has had greats in every era. Why then is this era called the golden era of tennis? More specifically, why is the quartet of Roger Federer, Rafael Nadal, Novak Djokovic and Andy Murray called the Big 4?

First, let’s take a customary look at the slam counts for players with more than 2 slams. The bars are color coded based on 10 year periods starting from 2017 and going back all the way to 1968.

Apart from the fact that 3 of the top 4 slam winners are from the golden era, we can see that the dark green (2008-2017) is pretty much split between Roger, Rafa and Novak. The other colors (say, pink) are spread across many more players. In terms of just the number of grand slams, it is clearly the Big 3.

Let’s take a simplified version of this idea.

  1. Split 1968-2017 into 10-year chunks resulting in 5 separate periods.
  2. For each period, calculate who the top 3 players are in terms of number of slams won.
  3. Calculate the proportion of slams won by the top 3.
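
The three steps above can be sketched in a few lines of Python. This is only an illustration, not the code behind the charts: the function name `top3_share`, the `(year, winner)` record format and the toy records are all assumptions made for the example.

```python
from collections import Counter

def top3_share(records, start=1968, end=2017, width=10):
    """Map each fixed period to the share of its slams won by its top 3 players.

    `records` is an iterable of (year, winner_name) pairs, one per slam final.
    """
    shares = {}
    for lo in range(start, end + 1, width):          # step 1: fixed 10-year chunks
        hi = min(lo + width - 1, end)
        wins = Counter(name for year, name in records if lo <= year <= hi)
        total = sum(wins.values())
        if total == 0:
            continue                                  # no slams recorded in this chunk
        top3 = sum(n for _, n in wins.most_common(3)) # step 2: top 3 by titles
        shares[(lo, hi)] = top3 / total               # step 3: their proportion
    return shares

# Toy records, purely illustrative (NOT real results)
toy = [(1968, "A"), (1969, "A"), (1970, "B"), (1971, "C"), (1972, "D"),
       (1978, "E"), (1979, "E"), (1980, "E"), (1981, "F")]
shares = top3_share(toy, start=1968, end=1987)
print(shares)
```

With real data, `records` would hold one row per grand slam final, and ties for third place would be broken by whatever order `Counter.most_common` happens to return.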

Clearly, 2008-2017 has a higher proportion of big winners! That’s over a 25% gap. However, an argument could be made against the way the years have been aggregated. It could be that 2008-2017 just happens to be where the Big 3 dominated. Cutting other greats’ careers into pieces would be doing a disservice to their dominance. For instance, Jimmy Connors played from 1972 to the 90s. His career would be split into three different parts if we aggregate the years from 1977-1986. The right thing to do would be to find the most dominant 10 years in the Big 3’s long careers and compare them with another 10-year period in which 3 superior players dominated the sport.

So, for each year from 1968, let’s aggregate that year and the 9 that follow. That means 1968-1977 would be one 10-year aggregation, 1969-1978 would be another and so on. For each aggregation, let us calculate the average number of slams won by the top 3 players. The ten most dominant windows are shown below.
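
As a rough sketch of this sliding-window idea (again, not the original analysis code), the snippet below slides a window one year at a time and averages the title counts of the three most successful players in each window. The name `rolling_top3`, the record format and the toy data are assumptions; so is the choice to keep only full-width windows and to divide by 3 even when a window has fewer than three winners.

```python
from collections import Counter

def rolling_top3(records, width=10):
    """Return (start_year, end_year, avg_slams) rows, most dominant window first.

    `records` is a list of (year, winner_name) pairs, one per slam final.
    """
    years = [year for year, _ in records]
    first, last = min(years), max(years)
    rows = []
    # Assumption: only windows that fit entirely inside the data are kept.
    for start in range(first, last - width + 2):
        end = start + width - 1
        wins = Counter(name for year, name in records if start <= year <= end)
        top3 = [count for _, count in wins.most_common(3)]
        rows.append((start, end, sum(top3) / 3))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows

# Toy records with a 2-year window, purely illustrative (NOT real results)
toy = [(2000, "A"), (2000, "A"), (2000, "B"),
       (2001, "A"), (2001, "C"), (2001, "D"),
       (2002, "D"), (2002, "D"), (2002, "D")]
rows = rolling_top3(toy, width=2)
for start, end, avg in rows:
    print(start, end, round(avg, 6))
```

Sorting by the average puts the most dominant window at the top, which is the layout of the table in this post.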

start_year end_year avg_slams
2004 2013 11.666667
2005 2014 11.333333
2006 2015 11.333333
2003 2012 11.000000
2007 2016 10.666667
2002 2011 10.000000
2000 2009 8.000000
1974 1983 7.666667
1999 2008 7.666667
1975 1984 7.333333

Out of curiosity, how many of these aggregations are connected to the Big 3?

In the top 10 aggregations, 8 feature the Big 3. Staggering.

But coming back to the point, the next most dominant era has been 1974-1983. It seems we have been splitting the years wrong… a little. Let’s take a look at the winners in this period.

winner_name n
Bjorn Borg 11
Jimmy Connors 7
John Mcenroe 5
Guillermo Vilas 4
Johan Kriek 2
Mats Wilander 2
Adriano Panatta 1
Arthur Ashe 1
Brian Teacher 1
John Newcombe 1
Manuel Orantes 1
Mark Edmondson 1
Roscoe Tanner 1
Yannick Noah 1

That is an average of 7.67 slams for each member of the top 3, but a total of 14 different grand slam winners in this 10-year period.

Now let’s take a look at the iron grip of the current Big 3 from 2004-2013.

winner_name n
Roger Federer 16
Rafael Nadal 13
Novak Djokovic 6
Andy Murray 2
Gaston Gaudio 1
Juan Martin Del Potro 1
Marat Safin 1

An average of 11.67 slams for each member of the top 3 - four more per player, a solid year’s worth of slams, than the next best era. In stark contrast to the previous list, we have just 7 different winners (half as many as in 1974-1983). The Big 3 are significantly better than the 3 biggest players of any other period, period.
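
The comparison between the two eras can be reproduced directly from the two tables in this post. The sketch below copies those counts by hand; the helper name `summarize` is made up for the example, and the share is computed only over the titles listed in each table.

```python
# Title counts copied from the 1974-1983 and 2004-2013 tables in this post
era_1974_1983 = {
    "Bjorn Borg": 11, "Jimmy Connors": 7, "John Mcenroe": 5,
    "Guillermo Vilas": 4, "Johan Kriek": 2, "Mats Wilander": 2,
    "Adriano Panatta": 1, "Arthur Ashe": 1, "Brian Teacher": 1,
    "John Newcombe": 1, "Manuel Orantes": 1, "Mark Edmondson": 1,
    "Roscoe Tanner": 1, "Yannick Noah": 1,
}
era_2004_2013 = {
    "Roger Federer": 16, "Rafael Nadal": 13, "Novak Djokovic": 6,
    "Andy Murray": 2, "Gaston Gaudio": 1, "Juan Martin Del Potro": 1,
    "Marat Safin": 1,
}

def summarize(counts):
    """Top-3 average, top-3 share of listed titles, and number of distinct winners."""
    top3 = sorted(counts.values(), reverse=True)[:3]
    return {
        "top3_avg": sum(top3) / 3,
        "top3_share": sum(top3) / sum(counts.values()),
        "winners": len(counts),
    }

old, new = summarize(era_1974_1983), summarize(era_2004_2013)
print(old)
print(new)
print("gap per top-3 player:", round(new["top3_avg"] - old["top3_avg"], 2))
```

The gap of four titles per top-3 player is the "year’s worth of slams" mentioned above, since a season has four grand slams.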

So, how did the term “Big 4” come up? Andy Murray has won 3 slams and so has Stan Wawrinka. Why does Murray get to sit alongside these freaks?

Slams tell only one part of the story. They are, after all, just 4 events a year, and judging the state of things by them alone doesn’t do the players justice. To paint a more complete picture, let’s also consider the Masters 1000 tournaments, the most prestigious tournaments after the Grand Slams and the Olympics. Andy Murray’s exceptional success outside of the slams explains why it is the Big 4 and why he is included in it.

Andy’s a freak in his own right. He’s right there amongst the greats. If Andy had played in any other era, he would probably have 10 slams. It does seem justified to call it the era of the Big 4.

What is amazing is that tennis has gotten faster, more powerful and more taxing over the years and, going by the trend, it will only become more so. This means that tennis is increasingly becoming a young man’s game: it saps the players, pushes the envelope on their conditioning and, for these very reasons, causes a lot of injuries. Despite all that, here they are atop their thrones, old but hungry as ever, making unsuspecting analyses like this one outdated with every passing year.

Acknowledgements

  1. Jeff Sackmann’s GitHub repository gave us all the tennis data that we needed for the analysis. It can be found here. He regularly writes many interesting articles like this.
  2. Amber Thomas’s website served as both the design and content inspiration for the blog.