Michael Eriksson's Blog

A Swede in Germany

Posts Tagged ‘Federer’

Djokovic as GOAT? (II) / Follow-up: Tennis, numbers, and reasoning

with one comment

As I suggested earlier this year, a strong case can be made for Djokovic as the GOAT of tennis. As he now has added another two majors, for a three-way tie with Federer and Nadal, even those obsessed with the flawed proxy of majors won (Cf. [1]) should slowly be caving.* This especially as his Wimbledon victories in 2019 and 2021 (this year) point to his being a clear favorite for 2020, had there been a tournament. In a reality just a little different, with Wimbledon postponed and the French Open canceled, Djokovic might lead 21 to 20 to 19 over Federer and Nadal.

*Except that tennis fans are often religious and might change criteria after the fact.

As to my own main proxy, weeks at number one, he has built a lead on Federer (while Nadal is not a factor) and will necessarily extend it further after his Wimbledon defense.

Moreover, as I wrote in [1]:*

*Footnotes removed for brevity.

The best way to proceed is almost certainly to try to make a judgment over an aggregate of many different measures, including majors won, ranking achievements, perceived dominance, length of career, … (And, yes, the task is near impossible.) For instance, look at the Wikipedia page on open era records in men’s singles and note how often Federer appears, how often he is the number one of a list, how often he is one of the top few, and how rarely his name does not appear in a significant list. That is a much stronger argument for his being the GOAT than “20 majors”. Similarly, it gives a decent argument for the Big Three being the top three of the open era; similarly, it explains why I would tend to view Djokovic as ahead of Nadal, and why I see it as more likely that Djokovic overtakes Federer than that Nadal does (in my estimate, not necessarily in e.g. the “has more majors” sense).

Look at the same page today, roughly two years later, and note how the distance between Federer and Djokovic has grown smaller or even reversed in various measures.

Should Djokovic add this year’s U.S. Open, winning the Grand Slam, this would probably close the debate for me. If he does not, I suspect that the developments over the next one or two years will leave the same conclusion. (But let us wait and see.)

Excursion on Federer as GOAT:
Now, if I were to argue Federer as GOAT, which is a position closer to my heart, I would probably rely on two things. Firstly, rivalries tend to favor the younger player, and will almost certainly have done so in the case of players this long-lived. This would give Federer a greater handicap from competing with the other “Big Three” than it does Djokovic and Nadal. Secondly, the great slowdown of surfaces has certainly favored the immensely strong defensive players and runners that are Djokovic and Nadal over Federer, who has a faster and more attack-based game. The downside of this argument is that we cannot know how other players would have fared without a slowdown, and maybe all three would have seen their success diminished relative to some even more attack-based, and/or younger, and/or more cannon-serving players. (Maybe, for instance, we would have had a four-way tie with Sampras at 14 in terms of majors won, with Sampras still leading in weeks at number one?)

Written by michaeleriksson

July 11, 2021 at 9:06 pm

Djokovic as GOAT? / Follow-up: Tennis, numbers, and reasoning

with 4 comments

In light of Djokovic now being set to overtake Federer in weeks-at-number-one and having just taken his 18th major, while Nadal has caught up with Federer at 20, it is time to briefly revisit a text on how to determine the tennis GOAT (Part II in a series)—or rather on why doing so is next to impossible.

As this blog is closed-ish, I will not dig deep into details or re-analyze what is said in the old text, but I do note that:

  1. I still consider weeks-at-number-one the best of the “easy” proxies. If we apply this proxy, Djokovic would (in just a few weeks) be the GOAT (of at least the Open Era). This especially as he is a fair bit younger than Federer (and a-year-or-so* younger than Nadal).

    *Here and elsewhere, note that I will not do any fact checking either. There might be minor errors here and there, but nothing that changes the “big picture”.

  2. I would still rate Federer’s career as the better overall, but not by that much and, again, Djokovic is the younger. Certainly, while Federer’s longevity is (was?) extreme, it appears that both Nadal and Djokovic are similar—possibly, even better.
  3. Federer’s dominance at his height was almost unsurpassable, and that might in the end be the strongest argument pro-Federer in a GOAT discussion and/or in a discussion of who was the best among the “Big Three”.
  4. Nadal’s fatal flaw remains that he has achieved too little (relatively speaking!) outside of clay and that he has mostly been second to either Federer or Djokovic at any given time. I can still see no true case for Nadal being more than the “Clay GOAT”. My old estimate of “Federer > Djokovic > Nadal” might now be “Federer = Djokovic > Nadal”, or Federer marginally ahead of Djokovic or Djokovic marginally ahead of Federer.

    However, Nadal has improved in the comparison of feats that formed Part III of the aforementioned series. The comparison made there was based on 12 French-Open titles, while he now stands at 13. (On the other hand, Djokovic reaching 9 Australian Opens, at a lesser age and on a more competitive surface, weakens the accomplishment in comparison.)

  5. The already tricky comparisons are made trickier by the effects of COVID, which include several weakened playing fields, including for Nadal’s 13th French Open and, maybe, the current/2021 Australian Open for Djokovic; a canceled Wimbledon (Djokovic reigning champion; Federer a strong victory candidate, had he played*); and a long period where the ATP ranking** was frozen or otherwise used exceptional rules.

    *Independent of the COVID issue, Federer appears to have taken portions of 2020 off for an injury break or operation or similar. I have not followed tennis in enough detail after 2019 to say for certain.

    **But I suspect that Djokovic would have remained at number one even with the regular rules, and would still be set to take over in weeks-at-number-one.

Skimming through the articles of the series, I note at least one faulty math statement (others might very well be present):

In Part I, I say that “For instance, the probability that the sum of two fair and six-sided dice exceeds* seven is 5/12 a priori but 5/6 given that we already know that one of the dice came up six.”, which is correct in the first half but not in the second: I had my mind on a scenario where one die (dice?) is thrown, it comes up six, and then the other die is thrown. As the order is not specified, another view is necessary. For this, there are 11 (equally likely) outcomes with at least one six, viz. 1–6, 2–6, …, 5–6, 6–6, 6–5, …, 6–2, 6–1. Of these, all but two (6–1, 1–6) exceed seven and the true probability, barring other errors on my part, should be 9/11. Looking at the difference, 9/11 – 5/6 = (54 – 55) / 66 = – 1 / 66, making the new result slightly smaller. (The difference is an implicit, faulty, double-counting of 6–6, which unlike e.g. the 5–6/6–5 pair only appears once.)

*Used in the “strictly greater” sense. Another weakness is that this formulation could be interpreted as “greater or equal”. In the latter case, both the old and the new “given that” probability is 1, as the event is unavoidable. (The probability for the first half of the statement would rise to 7/12.)
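
The figures above can be checked by brute-force enumeration of the 36 equally likely ordered outcomes. The following is a minimal Python sketch, not a definitive treatment; the helper `prob` and the event names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of two fair six-sided dice.
OUTCOMES = list(product(range(1, 7), repeat=2))


def prob(event, given=lambda roll: True):
    """Exact probability of `event` among the outcomes satisfying `given`.

    Hypothetical helper for illustration; both arguments are predicates
    that take a (first_die, second_die) tuple.
    """
    conditioned = [roll for roll in OUTCOMES if given(roll)]
    hits = [roll for roll in conditioned if event(roll)]
    return Fraction(len(hits), len(conditioned))


def exceeds_seven(roll):
    return sum(roll) > 7          # "strictly greater" reading


def at_least_seven(roll):
    return sum(roll) >= 7         # "greater or equal" reading


def has_a_six(roll):
    return 6 in roll              # at least one die shows six, order unspecified


def first_is_six(roll):
    return roll[0] == 6           # the ordered scenario: first die thrown is a six


print(prob(exceeds_seven))                 # 5/12
print(prob(exceeds_seven, first_is_six))   # 5/6
print(prob(exceeds_seven, has_a_six))      # 9/11
print(prob(at_least_seven))                # 7/12
print(prob(at_least_seven, has_a_six))     # 1
```

The two conditional results differ only in which outcomes are kept before counting: fixing the first die leaves six outcomes, while "at least one six" leaves eleven, with 6–6 counted once.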

Written by michaeleriksson

February 21, 2021 at 1:01 pm

Tennis, numbers, and reasoning: Part III

with 2 comments

Two post-scripts to the previous discussions ([1], [2]):

  1. In [1], I wrote

    Prime Federer’s feats are mind-numbing to those who understand the implications, including e.g. ten straight Grand-Slam finals with eight victories

    Nadal has since won his 12th (!!!) French Open—and was at eleven at the time of writing. How do these feats compare?

    This is a tricky question—and Nadal’s accomplishment undoubtedly is also one of the most amazing in tennis history.

    Overall, I would give Federer a clear nod when it comes to “mind-numbing”, because he has so many other stats that complement the specific one mentioned. This includes semi- and quarter-finals “in a row” statistics that are arguably even more impressive.

    When we look at these two specific feats, it is closer and the evaluation will likely be partially a matter of taste. Leaving probability theory out (in a first step), I would tend to favor Federer, because (a) he had a greater element of bad luck in that he ran into Nadal* on clay in the two finals that he lost, (b) he had to compete on different surfaces, which makes it a lot harder, (c) the clay competition (Nadal, himself, aside) has been much weaker than the hard-court competition, (d) he reached the finals in his misses while Nadal fell well short of the finals. In Nadal’s favor, he had to span at least** twelve years of high level play, while Federer only needed*** two-and-a-half.

    *Nadal almost indisputably being the “clay-GOAT”, Federer likely being the number two clay player of the years in question, and the results possibly being misleading in the way that Mike Powell’s were in [2]. (Then again, some other complication might have arisen, e.g. had Federer played in another era.)

    **Assuming a twelve-in-a-row. As is, he has missed thrice and therefore needed a span of fifteen years.

    ***But note that his longevity has been extraordinary.

    From an idealized probabilities point-of-view, looking just at numbers and ignoring background information, we have to compare 8 out of 10 to 12 out of 15.* To get some idea, let us calculate the probability** of a tournament victory needed to have a 50 % chance of each of these feats. By the binomial formula, the chance of winning at least*** 8 out of 10 is p^10 + 10 * p^9 * (1 – p) + 45 * p^8 * (1 – p)^2, where p is the probability of winning a single tournament. This amounts to a p of approximately .74, i.e. a 74 % chance of winning any given major. Similarly, at least 12 out of 15 amounts to p^15 + 15 * p^14 * (1 – p) + 105 * p^13 * (1 – p)^2 + 455 * p^12 * (1 – p)^3 and a p of roughly 0.76 or a 76 % chance of winning any given French Open. In other words, the probabilities are almost the same, with Nadal very slightly ahead. (But note both the simplifying assumptions per footnote and that this is a purely statistical calculation that does not consider the “real world” arguments of the previous paragraph.) From another point of view, both constellations amount to winning 80 %, implying that someone with p = 0.8 would have had an expectation value of respectively 8 out of 10 and 12 out of 15.

    *The latter being Nadal’s record from his first win and participation in 2005 until the latest in 2019. In this comparison, I gloss over the fact that Nadal realistically only had one attempt, while Federer arguably had more than one. This especially because it would be very hard to determine the number of attempts for Federer, including questions like what years belonged to his prime (note that his statistic is a “prime effort” while Nadal’s is a “longevity effort”) and how “overlapping” attempts are to be handled. I also, this time to Federer’s disadvantage, gloss over the greater difficulty of reaching a final in a miss. (I.e. I treat a lost final as no better than even a first-round loss.) I am uncertain who is more favored by these simplifications.

    **Unrealistically assumed to be constant over each of the tournaments during the time period in question. This incidentally illustrates Federer’s had-to-face-Nadal-on-clay problem: Two French Opens belong to both series and would then have had both Federer and Nadal at considerably better than a 50 % chance of winning… (Both were, obviously, won by Nadal.)

    ***Winning nine or ten out of ten is a greater feat, but must be considered here. If not, eight out of ten might seem even harder than it actually is. (Exactly eight out of ten corresponds to the third term, for those who must know.)

    As a comparison, having a 74, 76, or 80 % (geometric average) chance of winning any individual match of a Grand-Slam tournament is quite good—and above we talk about the tournaments in their entirety.

  2. When I watched tennis in the mid-1980s, I was often puzzled by the way players would miss “simple” shots, e.g. a smash at the net—why not just hit the ball a little less hard and with more control?

    I did understand issues like nerves and over-thinking even back then; however, I had yet to understand the impact of probabilities: Hitting a safety shot reduces the risk of giving the point away, but it also gives the opponent a greater chance to keep the ball in play. When making judgments about what shot to make, a good compromise between these two factors has to be found, and that is what a good player tries* to do. Moreover, the difference in points won is often so small that surprisingly large risks can be justified. Consider e.g. a scenario where player A wins 55 % of rallies over player B. Now assume that he has the opportunity to hit a risky shot with a 35 % risk of immediate loss and a 65 % chance of immediate victory,** and the alternative of keeping the ball in play at the “old” percentages. Clearly, he should normally take the risk, because his chance of winning the point just rose by ten percentage points (from 55 % to 65 %)… It is true that he might look like a fool, should he fail, but it is the actual points that count.

    *I am not saying that the decision is always correct, a regard in which young me had a point, but there is more going on than just e.g. recklessness and over-confidence. The decision is also not necessarily conscious—much more often, I suspect, it is an unconscious or instinctual matter, based on many years of play and training.

    **Glossing over cases where the ball remains in play. I also assume, for simplicity, that there are no middle roads, e.g. hitting a safe shot that still manages to increase the probability of a rally win. Looking more in detail, we then have questions like whether hitting the ball a little harder or softer, going for a point closer to or farther from this-or-that line, whatnot, will increase or decrease the overall likelihood of winning the point.

    Similarly, I had trouble understanding the logic behind first and second serves: If a player’s First Serve* is “better” than his Second (which is what my grand-mother explained**), why not just use the same type of serve on the second serve? Vice versa, if his Second Serve actually was good enough to use on the second serve and safer than the First (again, per my grand-mother**), why is it not good enough for the first serve? Again, it is necessary to understand the involved probabilities (and the different circumstances of the first and second serve): A serve can have at least two relevant*** outcomes, namely a fault and a non-fault (which I will refer to as “successful” below). Successful serves, in turn, can be divided into those that ultimately lead to a point win (be it through an ace, a return error, or through later play) and those that ultimately lead to a point loss. A fault leads to a second serve when faulting the first serve but a point loss (“double fault”) when faulting the second serve, which is the critical issue.

    *To avoid confusion, I capitalize “first serve” and “second serve” (and variations) when speaking of the actual execution (as in e.g. “Federer has a great First Serve”) and leave it uncapitalized when speaking of the classification by rule (as in e.g. “if a player faults his first serve, he has a second chance on his second serve”). Thus, normally, a player would use his First Serve on the first serve, but might theoretically opt to use his Second Serve instead, etc.

    **I am reasonably certain that these two explanations tapped out her own understanding: she was an adult and a tennis fan, but also far from a big thinker.

    ***A third, the “let”, is uninteresting for the math and outcomes, because it leads to a repeat with no penalty. I might forget some other special case.

    If we designate the probability* of a first serve being successful as p1s and ditto second serve p2s, and further put the respective probability of a point win given that the serve is successful at p1w and p2w, we can now put the overall probability of a point win (on serve) at p1s * p1w + (1 – p1s) * p2s * p2w. If using the same Serve, be it First or Second, for both serves, the formula simplifies to p1s * p1w * (2 – p1s) (or, equivalently, p2s * p2w * (2 – p2s)). A first obvious observation is that keeping the serves different gives a further degree of freedom, which makes it likely (but not entirely certain, a priori) that this is the better strategy. Looking more in detail at the formula, it is clear that the ideal second serve maximizes p2s * p2w, while the ideal first serve maximizes the overall formula given a value for p2s * p2w. Notably, an increase in p2s will have two expected effects, namely the tautological increase of the first factor and a diminishing of the second (p2w), because the lower risk of missing the serve will (in a typical, realistic scenario) come at the price of giving the opponent an easier task. An increase of p1s, on the other hand, will have three effects, those analogous to the preceding and a diminishing of the (1 – p1s) factor, which makes the optimal value for p1s smaller than for p2s.** In other words, the first serve should be riskier than the second.

    *Here simplifying (and unrealistic) assumptions are silently made, including that the probabilities are constant and that the player attempts the exact same serve on each occasion.

    **Barring the degenerate case of p2s * p2w = 0. If this expression has already been maximized, then p1s * p1w must also be = 0—and so must the overall formula. Further, unless p1w reacts pathologically to changes in p1s, e.g. flips to 0 whenever p1s < p2s. In such cases, p1s = p2s might apply. (But not p1s > p2s, because p1s * p1w is no larger than p2s * p2w, by assumption of optimization, while (1 – p1s) would then be smaller than (1 – p2s), implying that an increase of p1s above p2s lowers the overall value.)

    A more in-depth investigation is hard without having a specific connection between the probabilities. To look at a very simplistic model, assume that we have a new variable r (“risk”) that runs from 0 to 1 and controls two functions ps(r) = 1 – r and pw(r) = r that correspond to the former p1s and p2s resp. p1w and p2w. (Note that the functions for “1” and “2” are the same, even if the old variables were kept separate.) We now want to choose an r1 and r2 for the first and second serve to maximize (1 – r1) * r1 + r1 * (1 – r2) * r2 (found by substitution in the original formula). The optimal value of r2 to maximize (1 – r2) * r2 can (regardless of r1) be found as 0.5, resulting in 0.25. The remaining expression in r1 is then (1 – r1) * r1 + 0.25 * r1 = 1.25 * r1 – r1^2, which maximizes for r1 = 0.625 with a value of 0.390625. In this specific case, the optimal first serve is, in some sense, one-and-a-quarter times as risky as the optimal second serve (r1 / r2 = 0.625 / 0.5 = 1.25). (But note that this specific number need not apply even remotely to real-life tennis: the functions were chosen to lead to easy calculations and illustration, not realism. This can be seen from the resulting chance of winning a point on one’s own serve being significantly smaller than 0.5…)
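
Both post-scripts above rest on small calculations that are easy to re-run: the per-tournament probability needed for a 50 % chance of "at least 8 of 10" and "at least 12 of 15" (post-script 1), and the optimum of the toy serve model with ps(r) = 1 – r and pw(r) = r (post-script 2). The following is a minimal Python sketch under the same simplifying assumptions as the text (a constant p and independent tournaments; the deliberately unrealistic risk functions). The function names `at_least`, `required_p`, and `point_win` are hypothetical, and a bisection and a coarse grid search are used here in place of the by-hand calculation.

```python
from math import comb


def at_least(k, n, p):
    """Binomial probability of at least k wins in n tries at win chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def required_p(k, n, target=0.5, tol=1e-9):
    """Bisect for the p at which at_least(k, n, p) reaches `target`.

    Works because at_least is increasing in p for k >= 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if at_least(k, n, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def point_win(r1, r2):
    """Chance of winning the point on serve in the toy model.

    Assumes ps(r) = 1 - r (serve lands) and pw(r) = r (point won if it lands),
    so p1s * p1w + (1 - p1s) * p2s * p2w becomes the expression below.
    """
    return (1 - r1) * r1 + r1 * (1 - r2) * r2


# Post-script 1: probabilities needed for a 50 % chance of each feat.
print(round(required_p(8, 10), 2))    # roughly 0.74
print(round(required_p(12, 15), 2))   # roughly 0.76

# Post-script 2: grid search over risk levels in steps of 0.005.
STEPS = 200
grid = [i / STEPS for i in range(STEPS + 1)]
best_value, best_r1, best_r2 = max(
    (point_win(r1, r2), r1, r2) for r1 in grid for r2 in grid
)
print(best_r1, best_r2, best_value)   # 0.625 0.5 0.390625
```

The grid search is only a cross-check of the closed-form optimum; with other, more realistic choices for ps and pw, the same `point_win` structure could be reused, but the resulting numbers would differ.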

Written by michaeleriksson

June 25, 2019 at 8:53 am

Interesting sports events

There have been a few recent sports events that have been more interesting to me outside of sports than within:

Firstly, the European Championships in handball: During the time when I was the most interested in sports (late teens or so), Sweden was one of the world’s leading handball nations, often dueling it out with Russia. These days are long gone, and the world has changed sufficiently that Sweden’s smaller neighbor Denmark, an absolute nobody back then, is the reigning Olympic champion—something that teenage me would likely have considered an absurdity, even an insult, seeing that Sweden has racked up four silver medals without ever reaching the gold.

In the first game of the European Championships came the ultimate blow: A humiliating loss against dwarf country Iceland… I wrote off the rest of the Championship, reflecting on how sadly similar things had happened in tennis and table tennis, and noting how well this matched some of my thoughts on how short-lived traditions actually often are and how the world can change from what we know in our formative days. (Cf. my Christmas post.)

Today, Sweden played in the final of the same Championships against Spain, even having a half-time lead and an apparent good chance at victory. (Before, regrettably, losing badly in the second half. Still, a silver is far beyond what seemed possible after the Iceland game and a very positive sign for the future.) The road there was very odd, including the paradox of an extremely narrow semi-final win over Denmark, the aforementioned Olympic Champions, and another embarrassing and unnecessary loss against a smaller neighbor in Norway. Funny thing, sports.

Secondly, the immensity of Roger Federer’s 20th Grand-Slam title. A year ago, he and Nadal met up in the final of the Australian Open for what seemed like their last big hurrah—one of them was going to get a last title before age or injuries ended their competitive careers. Since Federer’s narrow win, we have seen another four Grand-Slam tournaments—with the winners Nadal, Federer, Nadal, and (with this year’s Australian Open) Federer. Indeed, where a year ago I was thrilled over the (presumed) last win, I was now slightly annoyed that Federer narrowly* missed going through the tournament without a loss of set. This is a very good illustration of how humans tend never to be satisfied, to ever want more or better**, and of how our baseline for comparisons can change.

*He entered the final without a lost set, won sets one and three, and only missed the second in a tie break. One or two points more and he would have had it. Such a result is extremely rare. (The oddity of 2017 notwithstanding, where it actually happened twice, making the year the more remarkable: Nadal in the French Open and Federer a few weeks later in Wimbledon.)

**Whether this is a good or bad thing will depend on the circumstances and on whether this tendency leaves us unhappy or not. At any rate, humanity would hardly have gotten to where it is without this drive.

An interesting lesson is the importance of adapting to new circumstances: Apparently, Federer has spent considerable time modifying his approach* to tennis in order to remain reasonably healthy and competitive even at his ancient-by-tennis-standards age of 36. Those who stand still fall behind (generally) and we all do well to adapt to counter aging (specifically).

*In a number of areas including style of play, racket size, and yearly schedule.

Written by michaeleriksson

January 28, 2018 at 11:15 pm