## Tennis, numbers, and reasoning: Part I

Preamble: This and a following text were intended as a single, not that long, piece. Because the length of the first part grew out of hand, I decided to split the text into (at least) two parts. Beware that a mixture of time constraints and the growing-out-of-hand left me lazy with the math: there might be errors through lack of checking that change the details (but not the principle), and there is a lack of explanation. (However, the math is not more advanced than what many high-schoolers encounter.) Note that I use the convention of ^ to indicate exponentiation, e.g. 2^3 = 2 * 2 * 2 = 8, and that “*” might be displayed oddly for technical reasons. (I normally use it only to indicate footnotes, and have not bothered to implement e.g. a math mode in my markup.)

With the latest French Open reaching its deciding phase, I have been reading a bit about tennis. A few resulting observations on tennis, numbers, and reasoning:

(Part I)

There is very little understanding of how probabilities come into play when it comes to e.g. who-beats-whom or what is and is not impressive. Notably, even many hard-core fans seem to jump to odd conclusions about superiority, inferiority, or who is too far past his prime to be reckoned with, based on a *single** match. This is highly naive, even when we discount questions like surface preferences, off days, and whatnot.

*Note: “single”, not “singles”.

Consider a hypothetical match-up, where two players (A and B) are so close in abilities that the winner of each individual set is a 50–50 matter. Even in a best-of-five setting, this leaves player A with a one-in-eight chance of a straight-sets victory, and ditto player B. In other words, there is a one-in-four chance that the match will be decided in only three sets, and who wins is a toss-up. Correspondingly, a single straight-sets victory does not necessarily say anything about the involved players. In a best-of-three setting, half of the matches would be straight-sets victories and who wins is, again, a toss-up.
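
The straight-sets figures can be checked with a few lines of Python. This is a minimal sketch, assuming independent sets with a fixed set-winning probability for player A; the function name `straight_sets_probability` is a hypothetical choice made for this illustration.

```python
from fractions import Fraction

def straight_sets_probability(p_set, sets_to_win):
    """Chances of a win without dropping a set.

    Assumes independent sets, each won by player A with probability p_set.
    Returns (A in straight sets, B in straight sets, either of the two).
    """
    p_a = p_set ** sets_to_win          # A wins every set played
    p_b = (1 - p_set) ** sets_to_win    # B wins every set played
    return p_a, p_b, p_a + p_b

half = Fraction(1, 2)
print(straight_sets_probability(half, 3))  # best-of-five: 1/8, 1/8, 1/4
print(straight_sets_probability(half, 2))  # best-of-three: 1/4, 1/4, 1/2
```

Exact fractions are used so that the results can be compared directly with the one-in-eight and one-in-four figures in the text.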

What *can* be done is to look at “Bayesian probabilities”*, i.e. try to determine the probability of something based on observed events. Given that player A beat player B, we can *suspect* that his chance of winning is higher than player B's. Certainly, if the probabilities of a set win are shifted from 50–50 to 90–10, this would also normally result in player A winning, while a 10–90 shift would typically leave player B as the winner. (But note that even a 90–10 scenario can result in an upset, especially in best-of-three.) To get reliable information from such considerations, however, a fairly large data set can be needed, as in repeated meetings or a clear superiority in terms of *games* or *points* won in a single match (but not just the match itself or the sets of the match; of course, any single-match evaluation is prone to other weaknesses, like ignoring the possibility of a single “bad day”).

*Going into details would go past the high-school level and, frankly, I might need to refresh my own memory. The principle, however, is that (a) the probability of X and X-given-that-Y are not (necessarily) the same, (b) suitable choices allow us to e.g. calculate an expectation value for an unknown probability. For instance, the probability that the sum of two fair and six-sided dice exceeds seven is 5/12 a priori but 5/6 given that we already know that a specific one of the dice (say, the first one thrown) came up six. Similarly, if this sum exceeds seven at a different ratio than 5/12 over a great number of repetitions, we might conclude that one or both dice are not fair, and even attempt to estimate new probabilities for the individual sides of the dice. The “reasoning” used when it comes to some tennis “experts” could be seen as a highly naive misapplication of this, viz. that “A beat B; ergo, the probability of A beating B is 100 %; ergo, A will always beat B”.
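
The dice figures can be verified by counting outcomes. The sketch below assumes two fair six-sided dice; the helper `probability` and the event functions are hypothetical names introduced only for this illustration. Note that 5/6 corresponds to knowing that one *particular* die shows six; if we only know that *at least one* of the two dice shows six, the count gives 9/11 instead.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair six-sided dice.
rolls = list(product(range(1, 7), repeat=2))

def probability(event, given=lambda roll: True):
    """P(event | given), found by counting equally likely outcomes."""
    possible = [roll for roll in rolls if given(roll)]
    favorable = [roll for roll in possible if event(roll)]
    return Fraction(len(favorable), len(possible))

def exceeds_seven(roll):
    return sum(roll) > 7

def first_is_six(roll):
    return roll[0] == 6

def any_is_six(roll):
    return 6 in roll

prior = probability(exceeds_seven)                           # 5/12
given_first_six = probability(exceeds_seven, first_is_six)   # 5/6
given_any_six = probability(exceeds_seven, any_is_six)       # 9/11
print(prior, given_first_six, given_any_six)
```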

As a notable example, let us look at the one official meeting between Pete Sampras and Roger Federer:

According to an archived version of official statistics, Federer and Sampras won respectively 1 and 0 matches (100–0), 3 and 2 sets (60–40), 31 and 29 games* (51.67–48.33), and 190 and 180 points (51.35–48.65).

*Including a tie-break each. Subtracting tie-breaks, we have 30 vs 28 and virtually the same percentages. Note that the set–game difference is likely increased and the game–point difference diminished through alternating service games (as opposed to e.g. alternating serve after each point).

The overall match result tells us next to nothing. Indeed, had but one or two points gone differently, it might have been Sampras winning.* The games tell us a little more, but still nothing that could not easily be the product of chance. Only the points give us some truer indication (despite having the *smallest* relative difference), but even that could be a product of chance or, e.g., some difference** in playing style or point distribution that is of little import.

*At least one example is obvious without looking at the individual development: Federer won the first set tie-break 9–7. Switch two points around and Sampras would, all other things equal, have won the match 3–1 (a somewhat clear victory to the naive eye). Switch one around and he would have had a roughly 50 % chance of winning from 8–8, and there might have been some earlier point in the tie-break, where even a single point would have handed him e.g. a 7–5.

**Consider e.g. a scenario where a player who already is a break up prefers to not fight back on his opponent's serve, in order to save himself for the next set. (Whether such factors applied in this specific match, I leave unstated.)

This was a genuinely close match and even just looking at the game score, this should be obvious. (Nevertheless, I have seen this match cited as proof that Federer was better* than Sampras, notwithstanding factors like that neither of them was in his prime.) Still, the margins on the point level are often fairly small and can still result in notable differences in overall results. For instance, imagine a 0.55 (i.e. 55 %) probability of winning any individual *point***, and see how this scales. Winning a point is (tautologically) a 55–45 proposition and the result of a single point played will tell us next to nothing (but the score over one hundred, two hundred, three hundred, …, points will be increasingly telling). If we assume that a game is played as best-of-five points,*** we now have a probability of 1 * 0.55^5 + 5 * 0.55^4 * 0.45^1 + 10 * 0.55^3 * 0.45^2 = .5931268750 or roughly 3/5 that player A wins an individual game (per the binomial formula). The difference in game-winning percentage (roughly 59–41) is then almost doubled compared to the point-winning difference (55–45). If we now approximate a set as best-of-nine games****, the binomial formula gives roughly a .7189 chance of player A winning a set. Applying this to matches determined by best-of-three and best-of-five sets,***** we then have a match-winning probability of roughly .8074 and .8610, respectively.

*This is another case of my disagreeing with the *reasoning* behind a claim, not necessarily the claim itself.

**Glossing over the complication that the probabilities will vary widely depending on who serves.

***This is not the case, nor is it necessarily a very realistic approximation. I considered making a more elaborate model, but deemed it too much work for a demonstration of principle. The best-of-five approximation is easy to calculate and requires no deeper modeling. To boot, it is likely to *understate* the difference that I try to show, which makes it more acceptable; to boot, the simplification of ignoring serves might be the larger error, had I intended to find more exact numbers (rather than demonstrate the principle); to boot, any model of a tennis game that involves fixed probabilities for all points (ignoring e.g. their relative importance, tiredness, nerves, …) is inherently simplistic. (An approximation as best-of-six might have been better, but would have involved the possibility of a draw, while best-of-seven might have *overstated* the difference.)

****Similar remarks apply.

*****Here the modeling is exact, because matches *are* played as best-of-three and best-of-five sets.
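
The chain from points to games to sets to matches can be reproduced as follows. This is a sketch under the same simplifications as in the text (a game as best-of-five points, a set as best-of-nine games, fixed and independent probabilities, no distinction between server and receiver); the helper name `best_of` is a hypothetical choice for this illustration.

```python
from math import comb

def best_of(n, p):
    """Probability of winning a best-of-n series (n odd) of independent
    trials, each won with probability p.

    Counting a majority over all n trials gives the same result as
    stopping once the series is decided.
    """
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))

p_point = 0.55
p_game = best_of(5, p_point)   # game approximated as best-of-five points
p_set = best_of(9, p_game)     # set approximated as best-of-nine games
p_bo3 = best_of(3, p_set)      # best-of-three match
p_bo5 = best_of(5, p_set)      # best-of-five match

print(f"game {p_game:.4f}, set {p_set:.4f}, "
      f"best-of-three {p_bo3:.4f}, best-of-five {p_bo5:.4f}")
```

Changing `p_point` to e.g. 0.52 or 0.60 shows how quickly a modest edge per point grows on the match level.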

From another point of view, consider claims like “player A would not be able to take a game off player B”. Even when this applies to a *typical* match, it does not (or only very, very rarely) apply categorically over all matches played between them, again for statistical* reasons. Assume that player A is so much worse that he virtually never wins a point in his opponent's service games and a mere 20 % of points in his own service games (making 15–60 a typical score for an own service game). This still gives him a chance of 1/5^4 or one in 625 to win any of his service games *to love* and .05792 or roughly 1/17 to win it *at all* by the above best-of-five model. This model might overstate the probability in this case, but if we say 1/30 as a rough guesstimate, and factor in that he would have at least three opportunities to serve per set, he would likely win a game roughly once every three best-of-five** or once every five best-of-three** matches. With a less disastrous difference, the odds improve correspondingly.

*Even discounting factors like player B gifting a game to be kind, player B having a sudden cramp, whatnot.

**Note that this translates to playing (three times) three resp. (five times) two sets under the assumptions made, because he would need absurd luck not to lose in straight sets.
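
The numbers for the hopelessly outclassed player can be sketched in the same manner. The code assumes the best-of-five-points model from the text, takes the 1/30 guesstimate as given, and counts exactly three service games per set in a straight-sets loss; `best_of` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def best_of(n, p):
    """Probability of winning a best-of-n series (n odd) of independent trials."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))

p_point = Fraction(1, 5)        # 20 % of points won on own serve
p_love = p_point ** 4           # four points in a row: 1/625
p_model = best_of(5, p_point)   # best-of-five-points model: 181/3125 = .05792

p_guess = Fraction(1, 30)       # the rough guesstimate from the text
matches_per_game = {}
for label, sets_played in (("best-of-five", 3), ("best-of-three", 2)):
    service_games = 3 * sets_played           # three service games per set
    expected_games_won = service_games * p_guess
    matches_per_game[label] = 1 / expected_games_won

print(p_love, float(p_model))
print({label: float(value) for label, value in matches_per_game.items()})
```

The last line gives the average number of matches per game won, which comes out at a little more than three for best-of-five and at five for best-of-three.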

This type of thinking demonstrates how unbelievable some of the exploits of the all-time greats are. For instance, to win forty straight matches requires an enormous superiority over the average opponent (and/or a ridiculous amount of luck). Prime Federer’s feats are mind-boggling to those who understand the implications, including e.g. ten straight Grand-Slam finals with eight victories; the full, mythical Grand Slam (i.e. all four tournaments won in the same year) is a considerably lesser accomplishment.

Excursion on other sports:

Some of the above applies equally to some or most other sports, e.g. the impressiveness of victories in a row. For instance, if an athlete or a team has a geometric average chance of 95* % of winning any individual competition (e.g. a tennis, boxing, or basketball match), the chance of winning ten in a row is 0.95^10 or roughly three in five, twenty in a row carries just a little more than a one-in-three chance, and forty in a row roughly one in eight. To have an at least 50 % chance at forty in a row, an individual probability of better than 98.28** % is required. Other parts do not apply, due to the unusual scoring of tennis (where e.g. a basketball game leaves the higher scorer the victor, while a tennis match might see the party with fewer points take the match).

*Note that this is a very high number, seeing that it must last for some time, is vulnerable to external conditions, must cover the risk of injury, etc. Moreover, the geometric average is more sensitive to low outliers than the regular arithmetic average. For instance, playing seven opponents with an individual 99 % chance of victory and a single toss-up opponent gives a geometric average of less than 91 % but an arithmetic average of 92.875 %.

**To understand how high this number is, note that it cuts the opponent's chance of winning down to a little more than a third of what it is for 0.95, which is already a very high number.
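
The streak numbers and the comparison of averages can be checked with a short sketch. It assumes independent matches; the function names are hypothetical and chosen for this illustration only.

```python
from math import prod

def streak_probability(p, n):
    """Chance of n straight wins when each match is won with probability p."""
    return p ** n

def required_probability(target, n):
    """Per-match probability needed for an n-win streak to have chance target."""
    return target ** (1 / n)

def geometric_mean(values):
    return prod(values) ** (1 / len(values))

def arithmetic_mean(values):
    return sum(values) / len(values)

for n in (10, 20, 40):
    print(n, round(streak_probability(0.95, n), 4))

needed = required_probability(0.5, 40)
print(round(needed, 4))                    # per-match probability needed
print(round((1 - needed) / (1 - 0.95), 3)) # opponent's chance relative to the 95 % case

chances = [0.99] * 7 + [0.5]               # seven near-certain wins, one toss-up
print(round(geometric_mean(chances), 4), round(arithmetic_mean(chances), 5))
```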

Excursion on probabilities, upsets, and the oddities of score keeping:

It might seem paradoxical that the score keeping used in tennis increases the difference in score compared to plain point counting, e.g. as with Federer–Sampras above, while also increasing the probability of upsets. This, however, is easy to understand by viewing games and sets as a grouping of smaller, somewhat independent events into larger, somewhat independent events. A reasonable analogy is a “plain” election system vs. a “first past the post” system.
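
The effect on upsets can be illustrated with the simplified model from above. The sketch below compares the nested scoring (best-of-five points within best-of-nine games within best-of-five sets) with a plain majority over the same maximum number of points. Both the 0.55 point probability and the choice of 225 points for the plain count are assumptions made for this comparison; the nested format usually ends after fewer points, so this is a rough comparison rather than an exact one. `best_of` is a hypothetical helper name.

```python
from math import comb

def best_of(n, p):
    """Probability of winning a best-of-n series (n odd) of independent trials."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))

p_point = 0.55

# Nested scoring: points -> games -> sets -> match.
p_nested = best_of(5, best_of(9, best_of(5, p_point)))

# Plain point counting over the same maximum number of points (5 * 9 * 5 = 225).
max_points = 5 * 9 * 5
p_flat = best_of(max_points, p_point)

upset_nested = 1 - p_nested   # weaker player wins under nested scoring
upset_flat = 1 - p_flat       # weaker player wins under plain counting
print(f"upset chance, nested: {upset_nested:.3f}; plain: {upset_flat:.3f}")
```

Under these assumptions, the weaker player wins noticeably more often with the nested scoring than with the plain count.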

This weakness to upsets is arguably a part of the charm of tennis, but it is a strong argument in favor of keeping important men’s matches at five sets and of introducing five-set matches among the women too.
