## Tennis, numbers, and reasoning: Part III

Two post-scripts to the previous discussions ([1], [2]):

- In [1], I wrote:

> Prime Federer’s feats are mind-numbing to those who understand the implications, including e.g. ten straight Grand-Slam finals with eight victories

Nadal has since won his 12th (!!!) French Open—and was at eleven at the time of writing. How do these feats compare?

This is a tricky question—and Nadal’s accomplishment undoubtedly is also one of the most amazing in tennis history.

*Overall*, I would give Federer a clear nod when it comes to “mind-numbing”, because he has so many other stats that complement the specific one mentioned. This includes semi- and quarter-finals “in a row” statistics that are arguably even more impressive.

When we look at these two specific feats, it is closer and the evaluation will likely be partially a matter of taste. Leaving probability theory out (in a first step), I would tend to favor Federer, because (a) he had a greater element of bad luck in that he ran into Nadal* on clay in the two finals that he lost, (b) he had to compete on different surfaces, which makes it a lot harder, (c) the clay competition (Nadal himself aside) has been much weaker than the hard-court competition, and (d) he reached the finals in his misses while Nadal fell well short of the finals. In Nadal’s favor, he had to span *at least*** twelve years of high-level play, while Federer only needed*** two-and-a-half.

*Nadal almost indisputably being the “clay-GOAT”, Federer likely being the number two clay player of the years in question, and the results possibly being misleading in the way that Mike Powell’s were in [2]. (Then again, some other complication might have arisen, e.g. had Federer played in another era.)

**Assuming a twelve-in-a-row. As is, he has missed thrice and therefore needed a span of fifteen years.

***But note that his longevity has been extraordinary.

From an idealized probabilities point-of-view, looking just at numbers and ignoring background information, we have to compare 8 out of 10 to 12 out of 15.* To get some idea, let us calculate the probability** of a tournament victory needed to have a 50 % chance of each of these feats. By the binomial formula, the chance of winning at least*** 8 out of 10 is p^10 + 10 * p^9 * (1 – p) + 45 * p^8 * (1 – p)^2, where p is the probability of winning a single tournament. This amounts to a p of approximately .74, i.e. a 74 % chance of winning any given major. Similarly, at least 12 out of 15 amounts to p^15 + 15 * p^14 * (1 – p) + 105 * p^13 * (1 – p)^2 + 455 * p^12 * (1 – p)^3 and a p of roughly 0.76 or a 76 % chance of winning any given French Open. In other words, the probabilities are almost the same, with Nadal very slightly ahead. (But note both the simplifying assumptions per footnote and that this is a purely statistical calculation that does not consider the “real world” arguments of the previous paragraph.) From another point of view, both constellations amount to winning 80 %, implying that someone with p = 0.8 would have had an expectation value of respectively 8 out of 10 and 12 out of 15.

*The latter being Nadal’s record from his first win and participation in 2005 until the latest in 2019. In this comparison, I gloss over the fact that Nadal realistically only had one attempt, while Federer arguably had more than one. This especially because it would be very hard to determine the number of attempts for Federer, including questions like what years belonged to his prime (note that his statistic is a “prime effort” while Nadal’s is a “longevity effort”) and how “overlapping” attempts are to be handled. I also, this time to Federer’s disadvantage, gloss over the greater difficulty of reaching a final in a miss. (I.e. I treat a lost final as no better than even a first-round loss.) I am uncertain who is more favored by these simplifications.

**Unrealistically assumed to be constant over each of the tournaments during the time period in question. This incidentally illustrates Federer’s had-to-face-Nadal-on-clay problem: Two French Opens belong to both series and would then have had *both* Federer and Nadal at considerably better than a 50 % chance of winning… (Both were, obviously, won by Nadal.)

***Winning nine or ten out of ten is a greater feat, but must be considered here. If not, eight out of ten might seem even harder than it actually is. (Exactly eight out of ten corresponds to the third term, for those who must know.)
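The two threshold probabilities quoted above can be checked numerically. The following is a minimal sketch, assuming (as the text does) a constant and independent win probability p per tournament; the helper names `prob_at_least` and `required_p` are illustrative choices, not anything from the original discussion.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k wins in n independent tries, each won with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def required_p(k, n, target=0.5, tol=1e-9):
    """Bisect for the per-tournament p that gives a `target` chance of at least k wins in n.

    Works because prob_at_least is increasing in p for k >= 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prob_at_least(k, n, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


federer_p = required_p(8, 10)   # eight wins in ten straight finals
nadal_p = required_p(12, 15)    # twelve French Opens in fifteen attempts

print(f"8 of 10:  p = {federer_p:.3f}")
print(f"12 of 15: p = {nadal_p:.3f}")
```

Under these assumptions the two values should land close to the .74 and .76 given in the text, with the 12-of-15 threshold slightly higher.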

As a comparison, having a 74, 76, or 80 % (geometric average) chance of winning any individual *match* of a Grand-Slam tournament is quite good, and above we talk about the tournaments in their entirety.

- When I watched tennis in the mid-1980s, I was often puzzled by the way players would miss “simple” shots, e.g. a smash at the net: why not just hit the ball a little less hard and with more control?
I did understand issues like nerves and over-thinking even back then; however, I had yet to understand the impact of probabilities: Hitting a safety shot reduces the risk of giving the point away, but it also gives the opponent a greater chance to keep the ball in play. When making judgments about what shot to make, a good compromise between these two factors has to be found, and that is what a good player tries* to do. Moreover, the difference in points won is often so small that surprisingly large risks can be justified. Consider e.g. a scenario where player A wins 55 % of rallies over player B. Now assume that he has the opportunity to hit a risky shot with a 35 % risk of immediate loss and a 65 % chance of immediate victory,** and the alternative of keeping the ball in play at the “old” percentages. Clearly, he should normally take the risk, because his chance of winning the point just rose by ten percentage points (from 55 % to 65 %)… It is true that he might look like a fool, should he fail, but it is the actual points that count.

*I am not saying that the decision is always correct, a regard in which young me had a point, but there is more going on than just e.g. recklessness and over-confidence. The decision is also not necessarily conscious—much more often, I suspect, it is an unconscious or instinctual matter, based on many years of play and training.

**Glossing over cases where the ball remains in play. I also assume, for simplicity, that there are no middle roads, e.g. hitting a safe shot that still manages to increase the probability of a rally win. Looking more in detail, we then have questions like whether hitting the ball a little harder or softer, going for a point closer to or farther from this-or-that line, whatnot, will increase or decrease the overall likelihood of winning the point.

Similarly, I had trouble understanding the logic behind first and second serves: If a player’s First Serve* is “better” than his Second (which is what my grand-mother explained**), why not just use the same type of serve on the second serve? Vice versa, if his Second Serve actually was good enough to use on the second serve *and* safer than the First (again, per my grand-mother**), why is it not good enough for the first serve? Again, it is necessary to understand the involved probabilities (and the different circumstances of the first and second serve): A serve can have at least two relevant*** outcomes, namely a fault and a non-fault (which I will refer to as “successful” below). Successful serves, in turn, can be divided into those that ultimately lead to a point *win* (be it through an ace, a return error, or through later play) respectively a point *loss*. A fault leads to a second serve when faulting the first serve but a point loss (“double fault”) when faulting the second serve, which is the critical issue.

*To avoid confusion, I capitalize “first serve” and “second serve” (and variations) when speaking of the actual execution (as in e.g. “Federer has a great First Serve”) and leave it uncapitalized when speaking of the classification by rule (as in e.g. “if a player faults his first serve, he has a second chance on his second serve”). Thus, normally, a player would use his First Serve on the first serve, but might theoretically opt to use his Second Serve instead, etc.

**I am reasonably certain that these two explanations tapped out her own understanding: she was an adult and a tennis fan, but also far from a big thinker.

***A third, the “let”, is uninteresting for the math and outcomes, because it leads to a repeat with no penalty. I might forget some other special case.

If we designate the probability* of a first serve being successful as p1s and that of a second serve as p2s, and further put the respective probability of a point win *given that the serve is successful* at p1w respectively p2w, we can now put the overall probability of a point win (on serve) at p1s * p1w + (1 – p1s) * p2s * p2w. If using the same Serve, be it First or Second, for both serves, the formula simplifies to p1s * p1w * (2 – p1s) (or, equivalently, p2s * p2w * (2 – p2s)). A first obvious observation is that keeping the serves different gives a further degree of freedom, which makes it likely (but not entirely certain, a priori) that this is the better strategy. Looking more in detail at the formula, it is clear that the ideal second serve maximizes p2s * p2w, while the ideal first serve maximizes the overall formula given a value for p2s * p2w. Notably, an increase in p2s will have two expected effects, namely the tautological increase of the first factor and a diminishing of the second (p2w), because the lower risk of missing the serve will (in a typical, realistic scenario) come at the price of giving the opponent an easier task. An increase of p1s, on the other hand, will have *three* effects, those analogous to the preceding and a diminishing of the (1 – p1s) factor, which makes the optimal value for p1s smaller than for p2s.** In other words, the first serve should be riskier than the second.

*Here simplifying (and unrealistic) assumptions are silently made, including that the probabilities are constant and that the player attempts the exact same serve on each occasion.

**Barring the degenerate case of p2s * p2w = 0. If this expression has already been maximized, then p1s * p1w must also be = 0, and so must the overall formula. Further barring cases where p1w reacts pathologically to changes in p1s, e.g. flips to 0 whenever p1s < p2s. In such cases, p1s = p2s might apply. (But not p1s > p2s, because p1s * p1w is no larger than p2s * p2w, by assumption of optimization, while (1 – p1s) would then be smaller than (1 – p2s), implying that an increase of p1s above p2s lowers the overall value.)
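The serve formula above is easy to try out with concrete numbers. The sketch below is only an illustration: the function name `point_win_probability` and the probabilities for the two serve types are hypothetical values picked for the example, not real tennis statistics.

```python
def point_win_probability(p1s, p1w, p2s, p2w):
    """Overall chance of winning a point on serve.

    p1s, p2s: chance that the first/second serve is successful (not a fault).
    p1w, p2w: chance of winning the point given that the respective serve was successful.
    """
    return p1s * p1w + (1 - p1s) * p2s * p2w


# Hypothetical serve types (made-up numbers): (success chance, win chance if successful)
first_serve = (0.60, 0.75)   # riskier, but more dangerous when it lands
second_serve = (0.90, 0.55)  # safer, but easier to return

mixed = point_win_probability(*first_serve, *second_serve)
two_firsts = point_win_probability(*first_serve, *first_serve)
two_seconds = point_win_probability(*second_serve, *second_serve)

# The same-serve case should agree with the simplified form p * w * (2 - p).
simplified_two_firsts = first_serve[0] * first_serve[1] * (2 - first_serve[0])

print(f"First then Second: {mixed:.4f}")
print(f"First twice:       {two_firsts:.4f}")
print(f"Second twice:      {two_seconds:.4f}")
```

With these particular made-up numbers, the conventional combination (First Serve followed by Second Serve) beats using either serve type twice, but other inputs can of course give a different ordering.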

A more in-depth investigation is hard without having a specific connection between the probabilities. To look at a very simplistic model, assume that we have a new variable r (“risk”) that runs from 0 to 1 and controls two functions ps(r) = 1 – r and pw(r) = r that correspond to the former p1s and p2s resp. p1w and p2w. (Note that the functions for “1” and “2” are the same, even if the old variables were kept separate.) We now want to choose an r1 and r2 for the first and second serve to maximize (1 – r1) * r1 + r1 * (1 – r2) * r2 (found by substitution in the original formula). The optimal value of r2 to maximize (1 – r2) * r2 can (regardless of r1) be found as 0.5, resulting in 0.25. The remaining expression in r1 is then (1 – r1) * r1 + 0.25 * r1 = 1.25 * r1 – r1^2, which maximizes for r1 = 0.625 with a value of 0.390625. In this specific case, the optimal first serve is, in some sense, one-and-a-quarter times as risky as the optimal second serve (r1 = 0.625 versus r2 = 0.5). (But note that this specific number need not apply even remotely to real-life tennis: the functions were chosen to lead to easy calculations and *illustration*, not realism. This can be seen in the resulting chance of winning a point on one’s own serve being significantly smaller than 0.5…)
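The optimum of this toy model can also be confirmed by brute force. The following is a rough sketch that assumes exactly the functions from the text, ps(r) = 1 – r and pw(r) = r, and searches a grid of risk values; the grid resolution and the names `point_win` and `grid` are arbitrary choices made for the example.

```python
def ps(r):
    """Toy model: chance that a serve with risk r is successful."""
    return 1 - r


def pw(r):
    """Toy model: chance of winning the point given a successful serve with risk r."""
    return r


def point_win(r1, r2):
    """Overall point-win chance with risk r1 on the first and r2 on the second serve."""
    return ps(r1) * pw(r1) + (1 - ps(r1)) * ps(r2) * pw(r2)


# Grid over [0, 1] in steps of 0.005 (an arbitrary resolution for this sketch).
steps = 200
grid = [i / steps for i in range(steps + 1)]

# Tuples compare by their first element first, so max() picks the highest value.
best_value, best_r1, best_r2 = max(
    (point_win(r1, r2), r1, r2) for r1 in grid for r2 in grid
)

print(f"r1 = {best_r1}, r2 = {best_r2}, point-win chance = {best_value}")
```

The search should return r1 = 0.625 and r2 = 0.5 with a value of 0.390625, matching the calculation by hand.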
