Michael Eriksson's Blog

A Swede in Germany

Posts Tagged ‘Nadal’

Follow-up II: Djokovic as GOAT? (III) and COVID distortions

with 5 comments

Disclaimer: Proof-reading the below, I realize that I have neglected to give the other players their lost Wimbledon points in the comparison—the text was simply thrown together a little too haphazardly. I will not redo the text, but I note that Alcaraz, per Wikipedia, went out in the fourth round, which is peanuts, comparatively speaking. The runner-up, Nick Kyrgios, is too far down the ranking to make a difference. Etc. The details of the below change a little to Djokovic’s disadvantage, but the main idea holds. (The overall comparison is also complicated by some players opting not to participate in Wimbledon to begin with, or being banned from doing so, like Medvedev. This could conceivably have had a larger pro-Djokovic effect.)

In a recent text, I discussed the artificial handicap given to Djokovic compared to e.g. Nadal in GOAT discussions through political meddling. The corresponding distortion through this year’s U.S. Open was lesser than I had feared, as Nadal neither won nor managed to get back to number one on the ranking. However, there is still a severe ranking effect:

The official ATP ranking currently* has a top-7 of:

*Note that this page is regularly updated. Data used represent the current state.

1 Carlos Alcaraz 6,740
2 Casper Ruud 5,850
3 Rafael Nadal 5,810
4 Daniil Medvedev 5,065
5 Alexander Zverev 5,040
6 Stefanos Tsitsipas 4,810
7 Novak Djokovic 3,570

Djokovic a lowly 7, even worse than before? Give Djokovic his 2,000 points for winning Wimbledon, and he is already at 4 (with 5,570 points), needing only 280 points to tie for second. What are the chances that he would have failed to gain more than 280 points combined over the Australian Open and the U.S. Open? Slim indeed. Getting to 1 is harder, as he would be missing 1,170 points. However, this could be achieved just by reaching the final (1,200 points) in one of the two majors, or by reaching the semi-final in both (2 x 720 points)—and we are talking about a man who won the one last year and reached the final in the other. (And this not counting any other tournaments in which he might have been disadvantaged, be it directly or through an artificially worsened seeding, cf. below.)
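The arithmetic can be retraced with a small sketch. The point values are those from the ranking list quoted above; the 2,000 Wimbledon points are the hypothetical addition discussed in the text, and the names `position` and `adjusted` are chosen purely for illustration. Like the text, the sketch ignores the Wimbledon points that the other players lost.

```python
# Top-7 points as quoted from the ATP ranking in the post.
ranking = {
    "Carlos Alcaraz": 6740,
    "Casper Ruud": 5850,
    "Rafael Nadal": 5810,
    "Daniil Medvedev": 5065,
    "Alexander Zverev": 5040,
    "Stefanos Tsitsipas": 4810,
    "Novak Djokovic": 3570,
}

# Hypothetical: the points a Wimbledon champion would normally have received.
WIMBLEDON_WINNER_POINTS = 2000


def position(points_table, player):
    """Return the 1-based rank of `player` when sorting by points, descending."""
    ordered = sorted(points_table, key=points_table.get, reverse=True)
    return ordered.index(player) + 1


adjusted = dict(ranking)
adjusted["Novak Djokovic"] += WIMBLEDON_WINNER_POINTS

djokovic_points = adjusted["Novak Djokovic"]
gap_to_second = adjusted["Casper Ruud"] - djokovic_points
gap_to_first = adjusted["Carlos Alcaraz"] - djokovic_points

# Prints: 4 5570 280 1170
print(position(adjusted, "Novak Djokovic"), djokovic_points, gap_to_second, gap_to_first)
```

A final (1,200 points) or two semi-finals (2 x 720 = 1,440 points) would both exceed the 1,170-point gap to first place.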

Of course, this ranking disadvantage does not just prevent him from improving his “days at number 1” statistic, it also implies a handicap in future tournaments, as he will be seeded worse than if he had been at 1 or 2. Then, again, we have the issue of the ATP Finals: his margin to remain in the top-8 is small indeed—and that is if he is even allowed to play, should he qualify.*

*I have not looked into details, but I would suspect that Djokovic has a larger number of points to defend during the autumn than most of the competition, which makes his chances even smaller.

All in all, this is just bullshit.

Looking at the current actual/official/whatnot number 1, Carlos Alcaraz: At 19, he is apparently the youngest number 1 in history and has, at least to me, come up out of nowhere.* In contrast, Félix Auger-Aliassime, who has been hailed as a new superstar since his mid-teens, is old enough, at 22, to be at or shortly before his prime by historical standards, but he has achieved less, and appears to have just dropped from 8th to 13th on the ranking. The new number 2, Casper Ruud, is 23 and has also torn ahead relative to Auger-Aliassime. Using the likes of the Big-3 as a comparison for Auger-Aliassime shows that he could have a great many years to prove himself; however, he is slowly reaching an age at which only a minority of the best-of-the-best, the Big-3 included, has failed to have a larger or considerably larger success. (Ages and ranking-drop according to the above rankings page.)

*But note that I have not followed tennis particularly closely the last few years.

Interestingly, members of the Big-3 have won three out of four majors this year, but we might still have seen the end of the Big-3 era. Federer is unlikely to ever make it back to the top and even Djokovic and Nadal must be approaching a day when age and accumulated wear-and-tear prove problematic. Going down the list, the next player of the same or higher age relative to Djokovic/Nadal is a mere 32nd (Gael Monfils, at age 36).


Written by michaeleriksson

September 12, 2022 at 1:47 pm

Posted in Uncategorized


Follow-up: Djokovic as GOAT? (III) and COVID distortions

with 5 comments

As a follow-up to an earlier text on Djokovic and COVID distortions:

The arguably best tennis player in the world, right here, right now,* is Djokovic. He was a single match away from winning a Grand Slam in 2021 and he has lost only one match in the majors this year (and that against Nadal in the French Open).

He still is no better than 6th on the world ranking; Nadal, his long-time rival, might retake the number one position on the world ranking presently; and Nadal might outdo him 3 majors to 1 for the year by the end of the ongoing 2022 U.S. Open. (Currently, 2 to 1.) To boot, Nadal might outdo him 23 to 21 overall. (Currently, 22 to 21.)

*Time of writing: September 3rd, 2022.

What is wrong with this picture? Well, firstly, looking at this year, Djokovic has been unfairly banned from two out of four majors (Australian Open,* U.S. Open) where he would have been the favorite** (and Nadal won the Australian Open in his absence, might do the same to the U.S. Open). Secondly, Djokovic’s Wimbledon victory gave him not one single point on the ATP ranking.*** All this for reasons of politics—not tennis.

*My original text, written during this tournament, speaks of the “on-going 2022 French Open”. This should, of course, be the “on-going 2022 Australian Open”.

**Very clearly so for the Australian Open; more narrowly for the U.S. Open.

***However, as unfortunate as this is for the sport of tennis, letting Wimbledon get away with blocking Russians (individual players are not party to the war) and Belorussians (even the country is not party to the war) because of the Russian invasion of the Ukraine would be a greater evil. Also note that the Wimbledon issue has a different cause than the other problems discussed.

(I have not looked into the non-majors, but there is a possibility that Djokovic has been similarly mistreated in other tournaments too. It can certainly not be ruled out that the unnecessary chaos, stress, and lost time has negatively affected him. Moreover, there is a non-trivial risk that he will be either unfairly banned from or unfairly fail to qualify for the ATP Finals, which would ruin his ranking further.)

Looking at the overall count of majors, this is partially caused by the disparities of 2022; however, these problems began earlier: the 2020 Wimbledon, where Djokovic was a clear favorite,* was canceled, while the French Open, won by Nadal, was merely postponed.

*He had won the two previous editions—and has gone on to win the two following.

I have written in detail about why the “majors won” heuristic for GOAT-hood and player comparisons is flawed ([1]). The extreme distortions over the last few years cement this—in a slightly different reality, Nadal might have two resp. three majors less (2020 French Open, 2022 Australian Open; 2022 U.S. Open), while Djokovic might have two resp. three majors more (2020 Wimbledon, 2022 Australian Open; 2022 U.S. Open). In this alternate reality, we would then see the current 22/23 vs. 21 change to 20 vs. 23/24.*

*With other alternate realities showing numbers in between. The point is that the current 22/23 vs. 21 is a clear and artificial distortion of historical greatness through the issues of the last few years.

The much more sensible “weeks at number one” heuristic is still clearly Djokovic’s, but his number is artificially diminished,* understating how great his career has been, while Nadal’s might soon be inflated, potentially giving him an unfair leg up against the likes of Sampras, Lendl, Connors.**

*Twofold: once for the reasons discussed here, which have caused him to be out of the top position when he likely otherwise would have held it; once through an earlier rankings’ freeze, where he did lead but his lead did not count in official statistics.

**But, to avoid misunderstandings, I would tend to give Nadal the nod over these past greats in a more holistic evaluation. Even a good heuristic is still a heuristic.

Excursion on different types of distortions:
Note that these distortions are not comparable with the misfortunes that arise through sheer bad luck (tends to even out over time; you win some, you lose some) or those that are rooted in the person of the player (e.g. being disqualified for yelling at an official, being injury prone). Here we have distortions imposed by others, and in a manner that systematically disadvantages one player (or one group of players) relative to the other players.

And, no, the bans based on vaccination status cannot be put on Djokovic with an imbecilic “He should just get the vaccine! Then he could play!”: Apart from basic human decency and the right to medical self-determination, Djokovic’s decision is perfectly rational and reasonable, and he should be lauded for standing up for what is right: he already has immunity through prior COVID, a man of his age and fitness would be at minimal risk through COVID (as would the other players), and the risks of the vaccines are not trivial for an elite athlete. (Moreover, we know by now that the risks of COVID, in general, are far smaller than were claimed in 2020, and that current strains are even less dangerous than the early ones. Many measures that might have seemed reasonable in 2020 cannot be considered so in 2022.)

Written by michaeleriksson

September 3, 2022 at 3:48 pm

Posted in Uncategorized


Djokovic as GOAT? (II) / Follow-up: Tennis, numbers, and reasoning

with one comment

As I suggested earlier this year, a strong case can be made for Djokovic as the GOAT of tennis. As he now has added another two majors, for a three-way tie with Federer and Nadal, even those obsessed with the flawed proxy of majors won (Cf. [1]) should slowly be caving.* This especially as his Wimbledon victories in 2019 and 2021 (this year) point to his being a clear favorite for 2020, had there been a tournament. In a reality just a little different, with Wimbledon postponed and the French Open canceled, Djokovic might lead 21 to 20 to 19 over Federer and Nadal.

*Except that tennis fans are often religious and might change criteria after the fact.

As to my own main proxy, weeks at number one, he has built a lead on Federer (while Nadal is not a factor) and will necessarily extend it further after his Wimbledon defense.

Moreover, as I wrote in [1]:*

*Footnotes removed for brevity.

The best way to proceed is almost certainly to try to make a judgment over an aggregate of many different measures, including majors won, ranking achievements, perceived dominance, length of career, … (And, yes, the task is near impossible.) For instance, look at the Wikipedia page on open era records in men’s singles and note how often Federer appears, how often he is the number one of a list, how often he is one of the top few, and how rarely his name does not appear in a significant list. That is a much stronger argument for his being the GOAT than “20 majors”. Similarly, it gives a decent argument for the Big Three being the top three of the open era; similarly, it explains why I would tend to view Djokovic as ahead of Nadal, and why I see it as more likely that Djokovic overtakes Federer than that Nadal does (in my estimate, not necessarily in e.g. the “has more majors” sense).

Look at the same page today, roughly two years later, and note how the distance between Federer and Djokovic has grown smaller or even reversed in various measures.

Should Djokovic add this year’s U.S. Open, winning the Grand Slam, this would probably close the debate for me. If he does not, I suspect that the developments over the next one or two years will leave the same conclusion. (But let us wait and see.)

Excursion on Federer as GOAT:
Now, if I were to argue Federer as GOAT, which is a position closer to my heart, I would probably rely on two things. Firstly, rivalries tend to favor the younger player, and will almost certainly have done so in the case of players this long-lived. This would give Federer a greater handicap from competing with the other “Big Three” than it does Djokovic and Nadal. Secondly, the great slowdown of surfaces has certainly favored the immensely strong defensive players and runners that are Djokovic and Nadal over Federer, who has a faster and more attack-based game. The downside of this argument is that we cannot know how other players would have fared without a slowdown—and maybe all three would have seen their success diminished relative to some even more attack-based, and/or younger, and/or more cannon-serving players. (Maybe, for instance, we would have had a four-way tie with Sampras at 14 in terms of majors won, with Sampras still leading in weeks at number one?)

Written by michaeleriksson

July 11, 2021 at 9:06 pm

Djokovic as GOAT? / Follow-up: Tennis, numbers, and reasoning

with 4 comments

In light of Djokovic now being set to overtake Federer in weeks-at-number-one and having just taken his 18th major, while Nadal has caught up with Federer at 20, it is time to briefly revisit a text on how to determine the tennis GOAT (Part II in a series)—or rather on why doing so is next to impossible.

As this blog is closed-ish, I will not dig deep into details or re-analyze what is said in the old text, but I do note that:

  1. I still consider weeks-at-number-one the best of the “easy” proxies. If we apply this proxy, Djokovic would (in just a few weeks) be the GOAT (of at least the Open Era). This especially as he is a fair bit younger than Federer (and a-year-or-so* younger than Nadal).

    *Here and elsewhere, note that I will not do any fact checking either. There might be minor errors here and there, but nothing that changes the “big picture”.

  2. I would still rate Federer’s career as the better overall, but not by that much and, again, Djokovic is the younger. Certainly, while Federer’s longevity is (was?) extreme, it appears that both Nadal and Djokovic are similar—possibly, even better.
  3. Federer’s dominance at his height was almost unsurpassable, and that might in the end be the strongest argument pro-Federer in a GOAT discussion and/or in a discussion of who was the best among the “Big Three”.
  4. Nadal’s fatal flaw remains that he has achieved too little (relatively speaking!) outside of clay and that he has mostly been second to either Federer or Djokovic at any given time. I can still see no true case for Nadal being more than the “Clay GOAT”. My old estimate of “Federer > Djokovic > Nadal” might now be “Federer = Djokovic > Nadal”, or Federer marginally ahead of Djokovic or Djokovic marginally ahead of Federer.

    However, Nadal has improved in the comparison of feats that formed Part III of the aforementioned series. The comparison made there was based on 12 French-Open titles, while he now stands at 13. (On the other hand, Djokovic reaching 9 Australian Opens, at a lesser age and on a more competitive surface, weakens the accomplishment in comparison.)

  5. The already tricky comparisons are made trickier by the effects of COVID, which include several weakened playing fields, including for Nadal’s 13th French Open and, maybe, the current/2021 Australian Open for Djokovic; a canceled Wimbledon (Djokovic reigning champion; Federer a strong victory candidate, had he played*); and a long period where the ATP ranking** was frozen or otherwise used exceptional rules.

    *Independent of the COVID issue, Federer appears to have taken portions of 2020 off for an injury break or operation or similar. I have not followed tennis in enough detail after 2019 to say for certain.

    **But I suspect that Djokovic would have remained at number one even with the regular rules, and would still be set to take over in weeks-at-number-one.

Skimming through the articles of the series, I note at least one faulty math statement (others might very well be present):

In Part I, I say that “For instance, the probability that the sum of two fair and six-sided dice exceeds* seven is 5/12 a priori but 5/6 given that we already know that one of the dice came up six.”, which is correct in the first half but not in the second: I had my mind on a scenario where one die (dice?) is thrown, it comes up six, and then the other die is thrown. As the order is not specified, another view is necessary. To this, there are 11 (equally likely) outcomes with at least one six, viz 1–6, 2–6, …, 5–6, 6–6, 6–5, …, 6–2, 6–1. Of these, all but two (6–1, 1–6) exceed seven and the true probability, barring other errors on my part, should be 9/11. Looking at the difference, 9/11 – 5/6 = (54 – 55) / 66 = – 1 / 66, making the new result slightly smaller. (The difference is an implicit, faulty, double-counting of 6–6, which unlike e.g. the 5–6/6–5 pair only appears once.)

*Used in the “strictly greater” sense. Another weakness is that this formulation could be interpreted as “greater or equal”. In the latter case, both the old and the new “given that” probabilities are 1, as the event is unavoidable. (The probability for the first half of the statement would rise to 7/12.)
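The corrected numbers can be checked by brute-force enumeration. The following is a minimal sketch that counts equally likely ordered outcomes; the helper names (`probability`, `exceeds_seven`, and so on) are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))


def probability(event, given=lambda roll: True):
    """P(event | given), computed by counting equally likely outcomes."""
    conditioned = [roll for roll in outcomes if given(roll)]
    favourable = [roll for roll in conditioned if event(roll)]
    return Fraction(len(favourable), len(conditioned))


def exceeds_seven(roll):
    return sum(roll) > 7


def at_least_one_six(roll):
    return 6 in roll


def first_is_six(roll):
    return roll[0] == 6


p_prior = probability(exceeds_seven)                       # a priori
p_any_six = probability(exceeds_seven, at_least_one_six)   # order unspecified
p_first_six = probability(exceeds_seven, first_is_six)     # one die thrown first

# Prints: 5/12 9/11 5/6 -1/66
print(p_prior, p_any_six, p_first_six, p_any_six - p_first_six)
```

Conditioning on "the first die is six" reproduces the original 5/6, while conditioning on "at least one die is six" gives 9/11, which is where the two readings part ways.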

Written by michaeleriksson

February 21, 2021 at 1:01 pm

Tennis, numbers, and reasoning: Part III

with 2 comments

Two post-scripts to the previous discussions ([1], [2]):

  1. In [1], I wrote

    Prime Federer’s feats are mind-numbing to those who understand the implications, including e.g. ten straight Grand-Slam finals with eight victories

    Nadal has since won his 12th (!!!) French Open—and was at eleven at the time of writing. How do these feats compare?

    This is a tricky question—and Nadal’s accomplishment undoubtedly is also one of the most amazing in tennis history.

    Overall, I would give Federer a clear nod when it comes to “mind-numbing”, because he has so many other stats that complement the specific one mentioned. This includes semi- and quarter-finals “in a row” statistics that are arguably even more impressive.

    When we look at these two specific feats, it is closer and the evaluation will likely be partially a matter of taste. Leaving probability theory out (in a first step), I would tend to favour Federer, because (a) he had a greater element of bad luck in that he ran into Nadal* on clay in the two finals that he lost, (b) he had to compete on different surfaces, which makes it a lot harder, (c) the clay competition (Nadal, himself, aside) has been much weaker than the hard-court competition, and (d) he reached the finals in his misses while Nadal fell well short of the finals. In Nadal’s favor, he had to span at least** twelve years of high level play, while Federer only needed*** two-and-a-half.

    *Nadal almost indisputably being the “clay-GOAT”, Federer likely being the number two clay player of the years in question, and the results possibly being misleading in the way that Mike Powell’s were in [2]. (Then again, some other complication might have arisen, e.g. had Federer played in another era.)

    **Assuming a twelve-in-a-row. As is, he has missed thrice and therefore needed a span of fifteen years.

    ***But note that his longevity has been extraordinary.

    From an idealized probabilities point-of-view, looking just at numbers and ignoring background information, we have to compare 8 out of 10 to 12 out of 15.* To get some idea, let us calculate the probability** of a tournament victory needed to have a 50 % chance of each of these feats. By the binomial formula, the chance of winning at least*** 8 out of 10 is p^10 + 10 * p^9 * (1 – p) + 45 * p^8 * (1 – p)^2, where p is the probability of winning a single tournament. This amounts to a p of approximately 0.74, i.e. a 74 % chance of winning any given major. Similarly, at least 12 out of 15 amounts to p^15 + 15 * p^14 * (1 – p) + 105 * p^13 * (1 – p)^2 + 455 * p^12 * (1 – p)^3 and a p of roughly 0.76 or a 76 % chance of winning any given French Open. In other words, the probabilities are almost the same, with Nadal very slightly ahead. (But note both the simplifying assumptions per footnote and that this is a purely statistical calculation that does not consider the “real world” arguments of the previous paragraph.) From another point of view, both constellations amount to winning 80 %, implying that someone with p = 0.8 would have had an expectation value of respectively 8 out of 10 and 12 out of 15.

    *The latter being Nadal’s record from his first win and participation in 2005 until the latest in 2019. In this comparison, I gloss over the fact that Nadal realistically only had one attempt, while Federer arguably had more than one. This especially because it would be very hard to determine the number of attempts for Federer, including questions like what years belonged to his prime (note that his statistic is a “prime effort” while Nadal’s is a “longevity effort”) and how “overlapping” attempts are to be handled. I also, this time to Federer’s disadvantage, gloss over the greater difficulty of reaching a final in a miss. (I.e. I treat a lost final as no better than even a first-round loss.) I am uncertain who is more favored by these simplifications.

    **Unrealistically assumed to be constant over each of the tournaments during the time period in question. This incidentally illustrates Federer’s had-to-face-Nadal-on-clay problem: Two French Opens belong to both series and would then have had both Federer and Nadal at considerably better than a 50 % chance of winning… (Both were, obviously, won by Nadal.)

    ***Winning nine or ten out of ten is a greater feat, but must be considered here. If not, eight out of ten might seem even harder than it actually is. (Exactly eight out of ten corresponds to the third term, for those who must know.)

    As a comparison, having a 74, 76, or 80 % (geometric average) chance of winning any individual match of a Grand-Slam tournament is quite good—and above we talk about the tournaments in their entirety.
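The two values of p can be found numerically. The sketch below uses the same simplifying assumption of a constant p and solves for the 50 % mark by bisection; the function names and the tolerance are illustrative choices, not part of the original calculation.

```python
from math import comb


def at_least(k, n, p):
    """Probability of at least k successes in n independent trials with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def required_p(k, n, target=0.5, tolerance=1e-9):
    """Find p with at_least(k, n, p) == target by bisection.

    Bisection works because at_least(k, n, p) increases with p.
    """
    low, high = 0.0, 1.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if at_least(k, n, mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


p_federer = required_p(8, 10)   # at least eight wins in ten majors
p_nadal = required_p(12, 15)    # at least twelve wins in fifteen French Opens

# Prints: 0.74 0.76
print(round(p_federer, 2), round(p_nadal, 2))
```

Summing all terms from k to n matches the expanded formulas in the text, e.g. three terms for 8 out of 10 and four terms for 12 out of 15.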

  2. When I watched tennis in the mid-1980s, I was often puzzled by the way players would miss “simple” shots, e.g. a smash at the net—why not just hit the ball a little less hard and with more control?

    I did understand issues like nerves and over-thinking even back then; however, I had yet to understand the impact of probabilities: Hitting a safety shot reduces the risk of giving the point away—but it also gives the opponent a greater chance to keep the ball in play. When making judgments about what shot to make, a good compromise between these two factors has to be found, and that is what a good player tries* to do. Moreover, the difference in points won is often so small that surprisingly large risks can be justified. Consider e.g. a scenario where player A wins 55 % of rallies over player B. Now assume that he has the opportunity to hit a risky shot with a 35 % risk of immediate loss and a 65 % chance of immediate victory,** and the alternative of keeping the ball in play at the “old” percentages. Clearly, he should normally take the risk, because his chance of winning the point just rose by ten percentage points… It is true that he might look like a fool, should he fail, but it is the actual points that count.

    *I am not saying that the decision is always correct, a regard in which young me had a point, but there is more going on than just e.g. recklessness and over-confidence. The decision is also not necessarily conscious—much more often, I suspect, it is an unconscious or instinctual matter, based on many years of play and training.

    **Glossing over cases where the ball remains in play. I also assume, for simplicity, that there are no middle roads, e.g. hitting a safe shot that still manages to increase the probability of a rally win. Looking more in detail, we then have questions like whether hitting the ball a little harder or softer, going for a point closer to or farther from this-or-that line, whatnot, will increase or decrease the overall likelihood of winning the point.

    Similarly, I had trouble understanding the logic behind first and second serves: If a player’s First Serve* is “better” than his Second (which is what my grand-mother explained**), why not just use the same type of serve on the second serve? Vice versa, if his Second Serve actually was good enough to use on the second serve and safer than the First (again, per my grand-mother**), why is it not good enough for the first serve? Again, it is necessary to understand the involved probabilities (and the different circumstances of the first and second serve): A serve can have at least two relevant*** outcomes, namely a fault and a non-fault (which I will refer to as “successful” below). Successful serves, in turn, can be divided into those that ultimately lead to a point win (be it through an ace, a return error, or through later play) and those that ultimately lead to a point loss. A fault leads to a second serve when faulting the first serve but a point loss (“double fault”) when faulting the second serve, which is the critical issue.

    *To avoid confusion, I capitalize “first serve” and “second serve” (and variations) when speaking of the actual execution (as in e.g. “Federer has a great First Serve”) and leave it uncapitalized when speaking of the classification by rule (as in e.g. “if a player faults his first serve, he has a second chance on his second serve”). Thus, normally, a player would use his First Serve on the first serve, but might theoretically opt to use his Second Serve instead, etc.

    **I am reasonably certain that these two explanations tapped out her own understanding: she was an adult and a tennis fan, but also far from a big thinker.

    ***A third, the “let”, is uninteresting for the math and outcomes, because it leads to a repeat with no penalty. I might forget some other special case.

    If we designate the probability* of a first serve being successful as p1s and ditto second serve p2s, and further put the respective probability of a point win given that the serve is successful at p1w respectively p2w, we can now put the overall probability of a point win (on serve) at p1s * p1w + (1 – p1s) * p2s * p2w. If using the same Serve, be it First or Second, for both serves, the formula simplifies to p1s * p1w * (2 – p1s) (or, equivalently, p2s * p2w * (2 – p2s)). A first obvious observation is that keeping the serves different gives a further degree of freedom, which makes it likely (but not entirely certain, a priori) that this is the better strategy. Looking more in detail at the formula, it is clear that the ideal second serve maximizes p2s * p2w, while the ideal first serve maximizes the overall formula given a value for p2s * p2w. Notably, an increase in p2s will have two expected effects, namely the tautological increase of the first factor and a diminishing of the second (p2w), because the lower risk of missing the serve will (in a typical, realistic scenario) come at the price of giving the opponent an easier task. An increase of p1s, on the other hand, will have three effects, those analogous to the preceding and a diminishing of the (1 – p1s) factor, which makes the optimal value for p1s smaller than for p2s.** In other words, the first serve should be riskier than the second.

    *Here simplifying (and unrealistic) assumptions are silently made, including that the probabilities are constant and that the player attempts the exact same serve on each occasion.

    **Barring the degenerate case of p2s * p2w = 0. If this expression has already been maximized, then p1s * p1w must also be = 0—and so must the overall formula. Further, unless p1w reacts pathologically to changes in p1s, e.g. flips to 0 whenever p1s < p2s. In such cases, p1s = p2s might apply. (But not p1s > p2s, because p1s * p1w is no larger than p2s * p2w, by assumption of optimization, while (1 – p1s) would then be smaller than (1 – p2s), implying that an increase of p1s above p2s lowers the overall value.)

    A more in-depth investigation is hard without having a specific connection between the probabilities. To look at a very simplistic model, assume that we have a new variable r (“risk”) that runs from 0 to 1 and controls two functions ps(r) = 1 – r and pw(r) = r that correspond to the former p1s and p2s resp. p1w and p2w. (Note that the functions for “1” and “2” are the same, even if the old variables were kept separate.) We now want to choose an r1 and r2 for the first and second serve to maximize (1 – r1) * r1 + r1 * (1 – r2) * r2 (found by substitution in the original formula). The optimal value of r2 to maximize (1 – r2) * r2 can (regardless of r1) be found as 0.5, resulting in 0.25. The remaining expression in r1 is then (1 – r1) * r1 + 0.25 * r1 = 1.25 * r1 – r1^2, which maximizes for r1 = 0.625 with a value of 0.390625. In this specific case, the optimal first serve is, in some sense, one-and-a-quarter times as risky as the optimal second serve (0.625 against 0.5). (But note that this specific number need not apply even remotely to real-life tennis: the functions were chosen to lead to easy calculations and illustration, not realism. This can be seen from the resulting chance of winning a point on one’s own serve being significantly smaller than 0.5…)
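Both the general serve formula and the toy model can be tried out numerically. The sketch below is only an illustration: the function names and the grid resolution are assumptions, and the grid search merely confirms the optimum derived above rather than replacing the derivation.

```python
def point_win_probability(p1s, p1w, p2s, p2w):
    """Chance of winning the point on serve: p1s * p1w + (1 - p1s) * p2s * p2w."""
    return p1s * p1w + (1 - p1s) * p2s * p2w


def toy_model(r1, r2):
    """Toy model from the text: ps(r) = 1 - r and pw(r) = r for both serves."""
    return point_win_probability(1 - r1, r1, 1 - r2, r2)


# Coarse grid search over risk levels in steps of 1/200 (an illustrative choice).
steps = 200
grid = [i / steps for i in range(steps + 1)]
best_value, best_r1, best_r2 = max(
    (toy_model(r1, r2), r1, r2) for r1 in grid for r2 in grid
)

# Prints: 0.625 0.5 0.390625
print(best_r1, best_r2, best_value)
```

Using the same serve twice in this model, e.g. `toy_model(0.5, 0.5)`, gives 0.375, which is lower than the 0.390625 reached with a riskier first serve.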

Written by michaeleriksson

June 25, 2019 at 8:53 am