Studying the GOAT race from the lens of Roger Federer, Novak Djokovic & Rafael Nadal's career points distribution
For nearly half a decade now, Roger Federer, Rafael Nadal and Novak Djokovic have been at the center of arguably the most intense GOAT debate in tennis history. And Djokovic's recent win at Roland Garros has added further fuel to - or even settled, according to some - the debate.
Novak Djokovic is being labeled the most 'complete' and 'versatile' player among the Big 3 because he is the only one to have won each Slam at least twice. But the matter may not be as simple as the numbers indicate.
Before we go any further, here's a brief overview of the premise of this analysis. Consider the two most extreme distributions of a hypothetical set of 20 Grand Slams across the four venues, say A = {20, 0, 0, 0} vs B = {5, 5, 5, 5}. Player A, in this example, is as accomplished as Player B in terms of total titles. And B's versatility across venues is countered by A's unmatched dominance in one.
Historically speaking, though, five itself is a very dominant number at any Grand Slam. Thus, while maintaining versatility of the highest order, Player B also ticks the dominance box.
Player A, however, has done little to establish his completeness in his bid to be considered the greater player. He has bettered something he was already the best at by a distant margin, while not addressing his weaknesses.
While the tally of 20 Grand Slams seems to have become the benchmark for GOAT-level accomplishments in men's tennis, the above distributions are far from reality. However, the element of completeness undeniably adds weight to the legacy of any player who can claim it.
And that is largely why Novak Djokovic sent his fans and detractors alike into a frenzy when he completed the elusive 'Double Career Grand Slam' last week. So how do we measure the Serb's feat from the lens of the 'completeness' factor?
The phases within a year
In order to study the all-round performances of the Big 3 across a calendar year, I have divided the tennis season into five phases:
Phase 1 includes all outdoor hardcourt events before Roland Garros.
Phase 2 includes all tournaments on clay.
Phase 3 includes all tournaments on grass.
Phase 4 includes all outdoor hardcourt tournaments between Wimbledon and the US Open.
Phase 5 includes all tournaments on indoor hardcourt and carpet, as well as those on outdoor hardcourt which are played after the US Open.
*The Olympics have been included in either Phase 3 or Phase 4, depending on the surface they were played on.
**Team competitions such as Davis Cup, Hopman Cup, Laver Cup and the ATP Cup have been excluded from the study, since progress in team events is dependent on teammates.
The hardcourt calendar has been split into three parts because hardcourts behave differently across venues and conditions.
Career points accumulated by Roger Federer, Rafael Nadal and Novak Djokovic
Career points, as the name suggests, are the points a player has earned across their entire career from all the tournaments they have taken part in (excluding team events, as mentioned earlier). But to calculate that, we first have to make a couple of adjustments - in order to account for the changes in rules over time.
The breakdown of the points that we have today was implemented at the end of 2008. And it was reflected in the 52-week rolling rankings for the first time at the start of 2009.
To account for that change, points earned prior to the establishment of the current system have been scaled appropriately. Grand Slams have been re-scaled to 2000 points, while every other point earned has been transformed proportionately too.
For example, if a tournament worth 400 points was played at a time when the Grand Slams were worth 1,000, that tournament has been transformed into an 800-point tournament for the computation of career points.
Additionally, the stepwise deduction in points with every lower round has been kept consistent with today's system.
Under the present structure, a runner-up earns 60% of the total points on offer, but previously that number used to be 70%. So say a player finished second at a tournament that was worth 300 points in 2006 (when Grand Slams were worth 1000). That would have earned him a total of 210 points back then. But for the purpose of our analysis, his scaled score from that tournament would be 60% of 2x300, i.e. 360.
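The two worked examples above can be reproduced with a short Python sketch. The function name and its parameters are illustrative rather than part of the original analysis, and the only round share taken from the text is the runner-up's 60%; shares for other rounds would have to come from the current points table.

```python
CURRENT_SLAM_POINTS = 2000  # a Grand Slam title under the system in use since 2009


def scaled_points(tournament_points, slam_points_then, round_share=1.0):
    """Re-express an old result on today's scale (illustrative helper).

    tournament_points: winner's points on offer when the event was played
    slam_points_then:  winner's points for a Grand Slam in that season
    round_share:       fraction of the winner's points that today's table
                       awards for the round reached (1.0 = title, 0.6 = final)
    """
    scale = CURRENT_SLAM_POINTS / slam_points_then
    return tournament_points * scale * round_share


# A 400-point title from the era of 1,000-point Slams
print(scaled_points(400, 1000))
# A runner-up finish at a 300-point event in 2006
print(scaled_points(300, 1000, round_share=0.6))
```

The first call corresponds to the 800-point tournament and the second to the 360-point runner-up finish described above. Note that the old 70% share never enters the calculation: only the tournament's scaled value and today's round share matter.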
Before assessing their 'completeness', let's also take a look at the career earnings of Roger Federer, Rafael Nadal and Novak Djokovic in terms of points. The tables below comprise all results up to the completion of Roland Garros 2021.
Roger Federer's career points across the 5 phases
Roger Federer has earned 169,692 points in his career, with the following phase-wise distribution:
- Phase 1: 42,970
- Phase 2: 28,112
- Phase 3: 29,386
- Phase 4: 30,422
- Phase 5: 38,802
Rafael Nadal's career points across the 5 phases
Rafael Nadal has earned 144,606 points in his career, with the following phase-wise distribution:
- Phase 1: 23,900
- Phase 2: 73,206
- Phase 3: 10,632
- Phase 4: 22,350
- Phase 5: 14,518
Novak Djokovic's career points across the 5 phases
Novak Djokovic has earned 144,749 points in his career, with the following phase-wise distribution:
- Phase 1: 37,734
- Phase 2: 34,824
- Phase 3: 15,809
- Phase 4: 26,803
- Phase 5: 29,579
Note: Since the Serb took part in two tournaments at Belgrade during the clay swing this year, his lesser result out of the two has been shifted to 2020.
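As a quick sanity check, the phase-wise figures listed above can be summed and converted into shares of each player's career total. This is only a sketch of the bookkeeping; the variable names are illustrative.

```python
# Phase 1 to Phase 5 points, copied from the lists above
PHASE_POINTS = {
    "Federer":  [42970, 28112, 29386, 30422, 38802],
    "Nadal":    [23900, 73206, 10632, 22350, 14518],
    "Djokovic": [37734, 34824, 15809, 26803, 29579],
}
STATED_TOTALS = {"Federer": 169692, "Nadal": 144606, "Djokovic": 144749}

for player, phases in PHASE_POINTS.items():
    total = sum(phases)
    # The five phases should add up to the stated career total
    assert total == STATED_TOTALS[player]
    shares = [round(100 * p / total, 1) for p in phases]
    print(f"{player:9s} total={total:,} phase shares (%)={shares}")
```

The shares make the contrast visible at a glance: Nadal's clay phase alone accounts for just over half of his career points, while no single phase dominates for Federer or Djokovic.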
The spread factor
The lower the dispersion, the greater the player's all-round completeness at a given tally of Grand Slams won. So whenever a player adds a trophy at his least successful venue, his tally there moves closer to his mean - thereby reducing the dispersion and making him 'more complete'.
Conversely, whenever a title is added to his best number, he distances himself from the mean. That increases the dispersion and makes him 'less complete'.
So will Rafael Nadal be less versatile as a Grand Slam champion than he is now if his next Major win is a 14th Roland Garros title? Yes! As will Roger Federer if he wins anything other than the French Open, and Novak Djokovic if he wins Wimbledon or the Australian Open.
These titles would carry the tallies of the Big 3 at the respective venues away from the mean - which is currently five for the Swiss and the Spaniard, and 4.75 for the Serb.
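The effect of a hypothetical next title can be checked with a small sketch. The article does not name its dispersion measure, so the population standard deviation is used here as an assumption. Nadal's split outside Paris is not spelled out in the text either; the figures below are his title counts as of Roland Garros 2021.

```python
from statistics import pstdev

# Order: Australian Open, Roland Garros, Wimbledon, US Open
SLAMS = ("AO", "RG", "W", "USO")
TITLES = {
    "Federer":  [6, 1, 8, 5],
    "Nadal":    [1, 13, 2, 4],
    "Djokovic": [9, 2, 5, 3],
}


def venues_that_raise_dispersion(tally):
    """Return the venues where one more title would increase the spread."""
    base = pstdev(tally)  # assumed dispersion measure
    raised = []
    for i, venue in enumerate(SLAMS):
        bumped = list(tally)
        bumped[i] += 1  # hypothetical next title at this venue
        if pstdev(bumped) > base:
            raised.append(venue)
    return raised


for player, tally in TITLES.items():
    print(player, venues_that_raise_dispersion(tally))
```

Under this assumption the output agrees with the claims above: only a Roland Garros title would raise Nadal's dispersion, any title other than Roland Garros would raise Federer's, and the Australian Open or Wimbledon would raise Djokovic's.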
However, the important thing to note here is that their 'reduced completeness' is merely a by-product of their increased success. Thus, even if we acknowledge versatility as a driving parameter for greatness, an objective comparison can be drawn only if they have the same, or at least similar totals.
On that note, here is a slab-wise breakdown of career points earned by the Big 3 and their corresponding spread factor across the five phases:
The spread factor used in the above table is an inverted measure of dispersion. A higher value of the coefficient indicates lesser variation, and hence greater all-roundedness.
The concept of versatility is further extended to accommodate all career winnings in terms of points. For the slabs, a given value N indicates points earned in excess of, or at least equal to N from any tournament.
The slab-wise break-up has been provided to outline how points were earned. After all, 2000 points earned by winning a single Grand Slam are not the same as 2000 points earned by collecting 200 points at 10 different tournaments.
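The slab definition can be sketched as a simple filter over a player's per-tournament results. The function name and the thresholds passed to it are illustrative; the two sample careers are the ones from the comparison above (one Grand Slam title versus ten 200-point results).

```python
def slab_totals(results, thresholds):
    """For each slab N, sum the points from tournaments that yielded at least N."""
    return {n: sum(r for r in results if r >= n) for n in thresholds}


one_slam = [2000]        # a single Grand Slam title
ten_small = [200] * 10   # ten results worth 200 points each

print(slab_totals(one_slam, (2000, 1000, 200)))
print(slab_totals(ten_small, (2000, 1000, 200)))
```

Both careers are level at the 200 slab, but only the first registers anything at the 1000 and 2000 slabs, which is exactly the distinction the slab-wise view is meant to expose.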
Determining the dispersion across Grand Slams is elementary. However, when we attempt the same for every point earned, we need to remove the bias arising out of varying phase lengths.
For example, considering all his tournament participations across his career, Roger Federer played for 52,425 points on grass and earned 29,386 from there. At the same time, he played for 88,959 points on clay, scoring 28,112.
So in order to assess his completeness at the elementary level, we have worked with his respective fractions for all the phases - such as 29,386/52,425 for phase 3 and 28,112/88,959 for phase 2.
It must be stated here though that 'participation' isn't being used in a negative light. Whenever a player takes to the court, he doesn't sign up to subtract anything from his existing career. The spread of his participation is merely being compared with the spread of his points in order to illustrate his all-roundedness.
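The normalisation described above amounts to dividing points earned by points played for in each phase. Here is a minimal sketch using the two Federer figures quoted in the text; the dictionary layout and function name are illustrative, and figures for the other phases are not given in the article, so they are left out.

```python
# (points earned, points played for), as quoted for Roger Federer
FEDERER = {
    "grass (phase 3)": (29386, 52425),
    "clay (phase 2)":  (28112, 88959),
}


def conversion_rate(earned, available):
    """Fraction of the points on offer that the player actually collected."""
    return earned / available


for phase, (earned, available) in FEDERER.items():
    print(f"{phase}: {conversion_rate(earned, available):.1%}")
```

Although Federer's raw totals on grass and clay are similar, he converted a little over half of the points he played for on grass against a little under a third on clay. It is these fractions, rather than the raw totals, that feed the dispersion calculation.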
The weight of a matchup
Novak Djokovic's win over Rafael Nadal in the Roland Garros semifinals was an amazing feat. However, by reaching the stage where they squared off, Nadal objectively added more to his legacy than he would have with an earlier exit.
Would the Spaniard's early elimination or absence from the semis have reduced the worth of Djokovic's title-winning run, even though the Serb committed no fault of his own? If not, then how does Nadal's presence objectively increase its value?
Certain victories and defeats assume greater significance purely because of emotions. The less you analyze them, the more you 'feel' them.
Using such solitary, emotion-driven matchups to rank greatness undermines the legacy that the greats have collectively built through decades of toil. Even though the loss lowered his win percentage at Court Philippe Chatrier, Rafael Nadal only added to his resume during his recent visit to Paris.
As did Novak Djokovic on the umpteen occasions he missed out earlier. And as did Roger Federer.
Chasing Roger Federer?
Novak Djokovic must surely have sealed the deal in the completeness department on Sunday, right? Not quite.
Surprisingly enough, at the highest level of achievement, Djokovic still doesn't have the best spread. If that idea seems preposterous, let's compare the deviations from his mean (4.75) with Roger Federer's (5).
- 9 - 4.75 > 5 - 1 (AO vs RG)
- 5 - 4.75 > 5 - 5 (Wimbledon vs USO)
- 4.75 - 3 > 6 - 5 (USO vs AO)
- 4.75 - 2 < 8 - 5 (RG vs Wimbledon)
Note: The inequalities above have been arranged with respect to the Slams where Djokovic and Federer have comparable deviations from their mean. Thus, Djokovic's Australian Open - where the Serb has the highest deviation from his mean - has been compared with Federer's Roland Garros (which is the Swiss' highest deviation from his mean). Similarly, Wimbledon represents the lowest deviation from the mean for Djokovic and the corresponding Slam for Federer is the US Open, so those two have been taken together too.
Federer's advantage in this department is further highlighted by the fact that if Djokovic adds a Grand Slam at either of his two favorite venues, he would still trail in the spread factor.
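The pairing used in the inequalities above can be reproduced by ranking each player's absolute deviations from his own mean and comparing them rank by rank. The sketch below also checks the hypothetical next title, using the population standard deviation as an assumed stand-in for the article's dispersion measure.

```python
from statistics import pstdev

DJOKOVIC = {"AO": 9, "RG": 2, "W": 5, "USO": 3}
FEDERER = {"AO": 6, "RG": 1, "W": 8, "USO": 5}


def ranked_deviations(titles):
    """Absolute deviations from the player's own mean, largest first."""
    mean = sum(titles.values()) / len(titles)
    return sorted(((abs(n - mean), venue) for venue, n in titles.items()),
                  reverse=True)


pairs = list(zip(ranked_deviations(DJOKOVIC), ranked_deviations(FEDERER)))
for (d_dev, d_venue), (f_dev, f_venue) in pairs:
    sign = ">" if d_dev > f_dev else "<"
    print(f"Djokovic {d_venue} {d_dev:.2f} {sign} {f_dev:.2f} Federer {f_venue}")

# Hypothetical: Djokovic wins his next Slam at one of his two best venues
federer_spread = pstdev(FEDERER.values())
for venue in ("AO", "W"):
    bumped = dict(DJOKOVIC, **{venue: DJOKOVIC[venue] + 1})
    print(venue, pstdev(bumped.values()) > federer_spread)
```

Djokovic's deviation is larger in three of the four ranked pairs, and under the assumed measure his dispersion stays above Federer's current figure after either hypothetical title.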
On the other hand, it is seemingly impossible for Rafael Nadal to numerically challenge his two rivals in terms of leveling out his performances. But if the southpaw is the first among the three to add a Major anywhere, he becomes the individual owner of the Grand Slam record. And that can always challenge others' claims at the top - since there is no formula that undisputedly compensates for a deficit in the totals.
But even if you leave aside the spread, Roger Federer always seems to be around. Despite all the times he has been victorious, it is his consistent challenge at the top when he has not lifted the title that has stood out.
A quick glance at Table 1 reveals Roger Federer's stranglehold at the top. At the highest level, he holds the cards both in terms of points won (tied with Rafael Nadal's) and their distribution (slightly better than Novak Djokovic's).
As we move down the slabs, Federer dominates even further. He is briefly surpassed by Djokovic in the 1000 and 800 slabs before going on to re-establish his lead with his Grand Slam semifinals and Masters runner-up finishes.
But by that point, Novak Djokovic has taken an unassailable lead in the spread factor - a testament to how he has proven his flexibility across surfaces. That said, his closest competitor in that department is operating with higher totals.
In four of the five phases, Roger Federer registers higher scores than Novak Djokovic, and the difference maxes out in phase 3 with 13,577 points. Meanwhile, Djokovic scores 6,712 more points in phase 2.
Both Federer and Djokovic lead Nadal in four of the five phases, whereas Nadal registers the biggest difference over both on clay.
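These head-to-head comparisons follow directly from the phase-wise totals listed earlier. A short sketch, with illustrative names, that computes the gap in every phase for any pair of players:

```python
# Phase 1 to Phase 5 career points, as listed in the phase-wise breakdowns
PHASE_POINTS = {
    "Federer":  [42970, 28112, 29386, 30422, 38802],
    "Nadal":    [23900, 73206, 10632, 22350, 14518],
    "Djokovic": [37734, 34824, 15809, 26803, 29579],
}


def phase_gaps(a, b):
    """Points of player a minus points of player b, phase by phase."""
    return [x - y for x, y in zip(PHASE_POINTS[a], PHASE_POINTS[b])]


for a, b in (("Federer", "Djokovic"), ("Federer", "Nadal"), ("Djokovic", "Nadal")):
    gaps = phase_gaps(a, b)
    leads = sum(g > 0 for g in gaps)
    print(f"{a} vs {b}: gaps={gaps}, {a} leads in {leads} of 5 phases")
```

For Federer against Djokovic, the largest positive gap is the 13,577 points in phase 3 and the only negative one is the 6,712 points in phase 2. Against Nadal, both rivals lead everywhere except on clay.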
Djokovic's spread is definitely a telling factor when compared to Nadal's, because they have racked up similar scores on the whole across their careers. Moreover, except for one fewer Grand Slam, Djokovic leads the Spaniard all the way through.
But against Federer, the Serb's advantage is turned against him as it is the Swiss who keeps his nose ahead almost entirely. And Federer's deficit in completeness at the lower levels is more than compensated for by his greater winnings there.
The age factor
Taking into account his dominance, consistency and versatility, Roger Federer will always have something going for him. If his Grand Slam count is surpassed, the distribution of his titles is still likely to be in his favor. If his overall distribution of performance works against him in relative terms, his consistency in earning points at every level will be a factor.
But the Swiss is nearing 40 years of age. And Novak Djokovic, merely 34, is in a position to surpass all his numbers in a couple of years. Rafael Nadal, only a year older than the Serb, is expected to contribute significantly to the race as well.
But while the chase looks like a sprint on the surface, it can end up being a marathon.
Considering career points, Rafael Nadal had the best start to his career among the three. Roger Federer and Novak Djokovic maximized their gains in the mid-20s. And regardless of the path they took to get there, the trio seemingly converged as they hit 30.
From then onwards, there has been very little to separate their progress from each other's. And the difference only appears to narrow down with each passing year.
At 34, Roger Federer's score read 144,002, Rafael Nadal had 141,666, and Novak Djokovic's reads 144,749. The Serb might add a few thousand more by the end of the year, but it will remain close nonetheless.
So what would be a 'safer' post-35 prediction for Nadal and Djokovic's progress? Do they keep following Federer's trajectory? Or does one curve spike while the other flattens out?
The race for greatness is a very engaging idea for the fans. And while the foundation of 20-20-19 seems interesting, the building blocks and their arrangement add an even more exciting dimension to it.
It takes you away from your usual all-or-nothing approach in viewing these legends and puts you in a win-win situation. You get to appreciate every match, AND you get to keep your beloved GOAT race alive.