Michael Eriksson's Blog

A Swede in Germany

Journalistic fraud II

Yesterday, I published a text on gross journalistic fraud; today, I am met with news sources claiming that RTL* has discovered at least seven cases of deliberate manipulation by one of its employees**… According to e.g. [1] (in German), the evidence is sufficiently clear that the employee has been summarily fired. Further checks of work stretching back twelve years are under way.

*One of the largest German TV broadcasters.

**Original sources use “Mitarbeiter”, which is vaguer than “employee” and might well refer to a non-employed collaborator. Depending on (unknown) context, another translation might be better.

While these individual cases do not necessarily say anything about the typical reporting,* they are a very bad sign—and they do make clear that we must not “believe everything written in the paper”, be it literally or metaphorically. Moreover, they point to a considerable need for media to improve its fact-checking.

*There are thousands of journalists, TV reporters, and whatnots active on a daily basis in Germany alone. Even a small percentage of fraudsters will lead to a non-trivial number of cases.

Written by michaeleriksson

June 14, 2019 at 5:47 pm

Posted in Uncategorized

Journalistic fraud

As a frequent* critic of journalism and journalists, I surprisingly managed to miss one of the biggest journalistic disasters in decades—the outright, large-scale fraud perpetrated by Der Spiegel reporter and repeated award-winner Claas Relotius. (See e.g. an extensive Der Spiegel text [1] and other texts linked from there, as well as English Wikipedia [2] and German Wikipedia [3].)

*See e.g. [4], [5], [6], [7], [8], [9].

Notwithstanding that I am half a year late to the party, there is plenty here that I wish to discuss, especially in the light of the “Lügenpresse”* and “fake news” controversies.

*A derogatory German word for the press, often used by populists. A reasonable literal translation is “press of lies”; a more idiomatically plausible, “liar press”.

Until now, I have considered “Lügenpresse” to be mostly a misattribution of intention, where the true issue is not deliberate lies but a mixture of differences of opinion, the indisputable ideological slant of too many journalists, and the ever-manifesting absurd incompetence of journalists—a failure to apply Hanlon’s razor by those critical of journalism. Events like these make me wonder. Is this a single, regrettable instance*, or is it just the tip of the iceberg?

*A cliched, almost knee-jerk claim by German organizations, when exposed to criticism, is that a particular problem is a “bedauerlicher Einzelfall”—this even when there is good reason to believe that the problem is either occurring often or a sign of a more pervasive underlying problem.

Journalistic fraud does exist, two of the more well-known instances being Tribal Rites of the New Saturday Night and the scandal around Stern* and Hitler’s diaries, and I see at least two reasons why it might be relatively common today: Firstly, the move towards more freelancing journalists and the need to be published to earn money, including that some online sources (e.g. Slant) pay writers based on clicks on their articles. If fiddling with the facts, or even outright invention, brings a better chance of being paid and/or more payment, then it can safely be assumed that some** will cheat. Secondly, many journalists are strongly Left-leaning, engaged in the PC movement, and/or see themselves as world-improvers. Looking at such people elsewhere, notably in the blogosphere and in general politics, they often have a “the end justifies the means” attitude and are very prejudiced about what their opponents and perceived enemies actually believe, say, do, whatnot. As with them, it would be unsurprising if some journalists fell for the temptation to fiddle with the facts to show the “truth”—e.g. that a “known” xenophobe who denies being one is “revealed” through some fake statements or untrue allegations.*** In addition, there is great reason to believe that “artistic liberties” are common, e.g. through exaggeration, presenting speculation as fact, oversimplification, paraphrasing while claiming to quote (also see an excursion), misleading translation,**** quoting out of context, use of unattributed and unverified material from others to flesh out one’s own research, … While these are typically less harmful, they often still leave readers with a faulty impression and are highly disputable from an ethical point of view.

*Another German magazine and one of Der Spiegel’s main competitors.

**How many is a very different question, and only speculation is possible without actual investigation. I do note, however, that some press reactions mentioned in [3] could point to a fairly large problem, including that Georg Altrogge claims that Der Spiegel could have provided fertile ground (“Nährboden”) for cheats through its story-telling attitude, that Michael Hopp admits to having cheated extensively (“immer viel”) himself, that Dirk Gieselmann (another award-winner) has been fired from several magazines, …

***This is not to be confused with blanket claims that e.g. “X is a xenophobe”, “X is extreme Right”, etc., which do abound but might be explainable through prejudice or ignorance even when actually incorrect. These too are a problem, but they are not necessarily fraudulent. To boot, many U.S. claims about e.g. “racist” are rooted in a lack of understanding of what “racist” means, including confusing it with “racial”. (A good example is a reference to the German AfD as “far Right” that I saw while reading up for this text. The claim is at best an exaggeration, even the uselessness of the term “Right” aside, but might well be explainable by the foreign source being ignorant of German politics and/or simply having uncritically listened to one of AfD’s many, mostly Leftist, detractors.)

****During the invasion of Iraq (when I still occasionally watched TV), the German news broadcasters often distorted English originals. My memories are understandably vague, but consider e.g. a scenario where a U.S. spokesman says “we do not know” one day, which is rendered as “the U.S. denies”, and the spokesman says “now we know—and it is true” a few days later, which is then rendered as “the U.S. has been forced to admit”. This is not to be confused with mistranslations out of ignorance or carelessness, which are quite common too.

Fact checking is a critical issue in journalism: Apparently, Der Spiegel has one of the largest fact-checking departments around and prides itself on its attention to detail. That it did not do its job well enough is quite clear, and this has been a source of criticism. However, I might be willing to overlook this instance—the main purpose of “internal”* fact-checking is to discover errors made by honest authors, e.g. through sloppy work, memory errors, or similar. Indeed, some amount of fact-checking is needed even by the author himself. Detecting whole-sale invention or large-scale deliberate manipulation is a secondary purpose, potentially a lot harder to do, and there were no obvious signs that a greater-than-usual diligence was needed here**. When we look at the overall situation, however, it is quite dire: The lack of fact-checking, insight, and critical thinking displayed again and again, in article after article, is horrifying. A reasonably famous example is the 1990s reports of women overtaking men in distance running, which I dealt with in parts of an older text on simplistic reasoning. Or consider the time when I encountered an FAZ*** article speaking of the age (!) of the universe in light(!)years. Or consider the many, many variations of the long debunked 77 cents on the dollar fraud, which simply does not hold up to critical thinking. Or how about my discussion in [9]? This is a massive problem in the world of journalism.

*E.g. by a magazine with regard to its authors, as opposed to by a magazine with regard to politicians.

**In contrast, with Hitler’s diaries such diligence was quite obviously needed, and there we have a true fact-checking scandal.

***The most prestigious daily paper in Germany.

Indeed, the disputable attitude towards fact checking, critical thinking, etc. is displayed by two quotes from [1]:

The fact-checking and research department at DER SPIEGEL is the journalist’s natural enemy

A sound attitude would be the exact opposite: It is the (competent and professional) journalist’s best friend.

You [the editor] are more interested in evaluating the story based on criteria such as craftsmanship, dramaturgy and harmonious linguistic images than on whether it’s actually true.

WTF!!! I am at a loss for words to express how idiotic, how mindlessly unprofessional, how fraudulent this attitude is. To boot, claims like “dramaturgy and harmonious linguistic images” bring us to another problem with journalism:

The focus on entertainment over information. The purpose of journalism is to bring information to the people—not entertainment and certainly not fake news. If I want to be entertained by something not true, there is always “Harry Potter”. A journalist (ditto, m.m., a news-paper or magazine) who forgets this is not worthy of his job.

Worse, this attitude usually leads to horrendously poor writing, as exemplified by several of the quotes of Claas Relotius articles that I encountered: this is supposed to be award-winning journalism?!? This cheesy, uninformative, emotionally manipulative nonsense!?!?

To get a better impression, I tried to read one of his works, specifically the infamous El Paso text/“Jaegers Grenze”* (co-authored by Moreno) that brought on the revelations. I started skimming after about a quarter and stopped reading entirely about half-way through: as far as journalism goes, it is horrible, even the fraud aspect aside. It is uninformative, speculative, jumps randomly from sub-topic to sub-topic, lacks a clear purpose, is filled with uninteresting trivia, and has a style of writing more suitable for a pure work of fiction—but it fails to reach the level of good fiction. This is the type of writing that makes me loathe reading the works of journalists—even were every word true, it would be a poor read. Still, Relotius won award after award… These awards might show an even greater problem than Relotius cheating: an anti-journalistic, pro-entertainment, and reader-despising attitude obviously present in journalism in general.

*In German. Beware that a warning note by Der Spiegel states that the text remains published until an internal commission has finished its investigation, with the implication that it might be removed afterwards.

Indeed, many of the articles on the scandal are themselves proof of poor journalism and writing, e.g. an apology piece on Fergus Falls, where there is an undue* amount of first-person perspective, irrelevant detail, and misguided and amateurish “human interest” angles, as e.g. with** “He laughs when I ask him if he’s angry. We’re eating pizza at a restaurant on Union Avenue that belongs to the mayor. “I first thought the article was a piece of satire,” says Becker. “I don’t feel offended at all.” He says he thought the writer was friendly – and he still does today. A nice guy. Becker says he’s worried about him.”—further proof that the typical journalist is best kept away from journalism.

*Not all first-person perspective is undue, e.g. because a certain text deals with or draws on personal experiences, attempts to differentiate between fact and own opinion, tries to give the author’s take on an issue, … This is the case with many of my own texts (and this sentence is itself an example of a valid use) and the quotes of what Becker said above are examples of legitimate uses, because his side of the story is the topic. However, this is only rarely relevant to journalism, which should strive to be as disconnected from the author as possible (for instance, if the journalist had made Becker’s statements, they would have been out of place). Moreover, very many journalist uses miss the point entirely, amounting to irrelevant nonsense—as e.g. with the above “We’re eating pizza at a restaurant on Union Avenue that belongs to the mayor.”, which is pointless “human interest” blurb for the dumbest of the readers.

**The quoted text in original used “type writer” quotes around the statements by Becker. If they appear as “fancy” quotes, WordPress has distorted them.

If we look at the tendency of the fakery by Relotius, there are some pieces that could be seen as potentially distortive Leftist propaganda, including “Touchdown” (a piece on Kaepernick), “Jaegers Grenze” (bigoted White men vs. a Honduran woman), and “In einer kleinen Stadt” (people in Fergus Falls dislike Mexicans). Looking at the overall list from [3], I am inclined to give him the benefit of the doubt and assume that he is mostly looking for sob stories, “human interest” stories, and similar; however, it is noteworthy that journalists (in at least Germany and Sweden) tend to be Left-leaning and often slant their reporting accordingly. This includes Der Spiegel and, to a very high degree, its “Spiegel Online” (“SPON”) sister, where the mixture of low quality and ever-recurring Leftist thought eventually drove me away.*

*The last straw was an opinion piece calling for journalists to be activists, to throw away objectivity, and to fulfill their “democratic role” (“demokratische Aufgabe”) by telling people what to think—and it does so while painting an incorrect picture of the scope and (partially) character of immigration resistance, alleging Right-wing hatred while ignoring the larger problem on the Left, over-looking the already strong Leftist media-bias, etc. This is exactly what a good journalist must not do, and the fact that too many journalists have already gone down this path is a major reason why current journalism is so useless. Indeed, that piece was strongly on my mind when I wrote my suggestions for a new press ethics (cf. [6]).

Juan Moreno, who first* saw through the fraud and pushed for investigations, is an interesting contrast, giving me some hope that the profession of journalism is not entirely beyond redemption. To boot, I can sympathize strongly with his adversities through my own experiences as someone with the ability to spot potential problems, and who often has been met with disbelief worthy of a Cassandra or even the accusation of having a hidden agenda. (Later events have usually proved me right.)

*Or was he? Possibly, others preceded him, but lacked the integrity, courage, and/or persistence to achieve his results… [3] points to some known suspicions as early as 2017…

Excursion on my own experiences with the press:
I can give two pertinent examples relating to myself in the press (both from my youth in Sweden; both relating to “Bergslagsposten”, the small local paper):

Firstly, and fairly harmlessly, I was one of several library visitors polled by a journalist concerning our reading preferences. My answers to several questions were pieced together and presented as a much more fluent version—appearing to be a direct quote. (An interesting, but off-topic, parallel is found in a recent text on the German police.)

Secondly, I wrote several “letters to the editor”—all of which were invariably mangled in various ways, e.g. through introducing spelling errors not present in the original or leaving out words. Some excuse might have been found in my poor hand-writing, but this continued even when I switched to typing.* I remember particularly well how repeated uses of “ideologi” (“ideology”) or one of its variations were changed to (likely) “idologi” throughout one text. The frequency of such problems was so large that I am not willing to apply Hanlon’s Razor. Instead, I must conclude that deliberate manipulations to my disadvantage took place. (While I cannot say for certain by whom, I strongly suspected a junior-staff member of the paper who was also a member of the semi-rabid youth organization of the Communist Party—seeing that my letters were often critical of the strongly Left-leaning Swedish society and that I was a known member of a libertarian and neo-liberal youth organization.)

*Note that this was at a time when computers and printers were much rarer than today.

Disclaimer: I wrote most of the above a few weeks ago. I have not verified that the various links contain the same contents at the time of publication as they did at the time of writing.

Written by michaeleriksson

June 13, 2019 at 8:22 pm

Problems with YouTube content

Spending some time on YouTube, I find a lot of annoyances. Spending some time looking through my drafts, I find that I had already started to write something on the topic. The below is a slightly polished version of the draft, with the reservation that I do not always remember the exact context of some complaints. The footnotes were all added during polishing, in lieu of editing the main text. The formulation as “do not” was almost certainly an error, but I am not keen on a re-write.

There is a lot of crap on YouTube, which is neither surprising, nor necessarily a problem. What is problematic: Even when the content is good, the presentation is often very poor—and in some cases showing an immense contempt for the viewers. Sadly, the more “professional” the poster or channel tries to be, the worse it tends to perform in these regards. In many ways, it is as if they have taken the worst sins of incompetent TV productions and raised them to virtues.

YouTubers (and TV producers!), please do not:

  1. Waste the viewer’s time with long intro sequences without content. There are plenty of five-minute videos that start with a thirty-second intro, with nothing but logos or generic information about the poster…* This is all the worse when viewing several videos by the same poster one after the other.

    *In a parallel, where movies of old might have started with a brief clip for the studio, e.g. MGM’s roaring/yawning lion, many modern movies have half-a-dozen such clips for various entities, which can postpone the start of the actual movie for minutes. Result? I am annoyed and skip forward…

  2. Add background music for no good reason—but if you still do, pick something of quality and with a bit of variation. Save us from those endless repetitions of the same ten seconds of unimaginative drum beat or synthesizer chords.* Either the video has dialog and background noise that is of interest and then there should be no music at all; or not and then I would much prefer to listen to the music of my choice. Half the time, I end up having the video on mute…**

    *I have the impression that there is some repository of fairly second-rate free-for-use music provided by YouTube itself, and that many posters just pick something from this repository based on the first hearing sounding “cool”. After five minutes of repetition, it is a different story altogether. Note that this can apply to even far higher quality music: I recall being driven up the wall by the DVD “extras” for “Pirates of the Caribbean”, which all played the same portion of the movie score over-and-over-and-over-again.

    **Here I probably had my eyes on videos that relied mostly on the actual video part, e.g. wild-life scenes, pets doing weird things, or “fails”. The claim does not apply to more talk-centric videos, e.g. skits or discussions of training tips. (If in doubt, because they are less likely to be infested with poor music…) More generally, the original text is often a bit indiscriminate when it comes to type of video.

    Bad music is worse than no music!

  3. Prioritize the contents lower than the moderator/narrator/whatnot: The latter should only be seen and heard when they bring value to the content, not use the contents to attempt* to make themselves look good or cool. If you have the content, let the content speak; if you do not, pretending that you do just makes you look like an idiot.

    *They usually fail…

  4. Pollute the content with irrelevant animations, over-sized logos, or gaps between e.g. items on a list*: Use animations only when it helps clarify the content, not because you want to “pep up” the video or draw attention to yourself. Keep logos discreet, un-animated, and informative. Let the content flow; in particular, do not make a ten second pause between every item on a list or count-down.

    *A great many videos are of the type “Top-10 X of all times”, “20 ways to Y”, etc. These often take a break between the actual contents of the items to play a sound, show the number of the following item, say the number (“Secret tip number niiiine!”), or similar. The break is often so long as to be boring—and to raise the suspicion that the main purpose is to artificially increase the run-time of the video…

  5. Add unnecessary sounds and visual effects.
  6. Attempt to sound “cool”, excited or exciting, whatnot when speaking. Ideally, the contents should (metaphorically) speak for themselves, without weird manipulations. (The fact that they might need a literal speaker to help them is not a reason to change this.) A typical sport-reporter is a negative example.
  7. Add padding around the video to make it fit a certain format (e.g. 1600×900). By doing so, you prevent offline media players that automatically scale the image to match the display (i.e. virtually all modern players) from doing so, while bringing no benefit whatsoever to online/in-browser players. In fact, the latter can even get into problems because they have too little view space available. In effect, you make the file larger in order to deliver an inferior product…
  8. Add replays of what just happened. Users are perfectly capable of re-winding and re-playing, with or without slow-motion.* Avoid multiple replays of the same scene especially.

    *As a minor reservation, there might be rare instances where such a replay can be justified through higher picture quality. This, however, requires both that the scene benefits non-trivially from the higher quality (most do not) and that the result actually has a noticeably higher quality. The latter will often be the case when the video draws on an original source of a higher quality than its own (e.g. through a higher frame-rate, a less lossy encoding, or a higher resolution); however, it will not be the case e.g. when the video and the original use the exact same format.

  9. Abuse YouTube for non-video content. If you have sound without picture, put it somewhere else—do not add artificial images (usually stills) to make it appear like video content. Ditto photos: There are plenty of services to host photos. Making a “video” out of them just to use YouTube is idiotic and user unfriendly.
  10. Pan around a still image. It is annoying and distracting, and makes it harder for those who actually want to study the image.
  11. Use the same or similar names for all of your own videos, or something used by others all the time. “Top-10 fails”, e.g., is a lousy name that makes it very hard to determine what one has already watched and what not. If nothing better can be found, something along the lines of “[your name]’s fail choices for 2016” at least gives the viewer a chance. Similarly, use a name that is actually compatible with the contents: “Fail”, for instance, does not mean* “generic YouTube video”—it means that someone screwed up, usually in an entertaining manner.

    *The word “mean” was not present in the draft and I am not certain that this was my original intention; however, it is the easiest correction that makes the sentence plausible.

  12. Re-hash the same fail (or other borrowed content) that ten other compilations already have. Some overlap is unavoidable, but please try to be more original and to pay attention to the competition.
  13. Insult the viewer’s intelligence with demands that he “like”, recommend, subscribe, … Viewers are adult enough to make up their own minds and this type of intrusive command is more likely to turn a viewer away than to entice him. Explicitly calling the people who do not “like” a video losers, as at least one video did, is almost guaranteed to have a negative effect. You see fewer subscribers than you want to? Your best bet is to increase the quality or quantity of your contents—not harass your viewers.

    As a general rule, the imperative has no place whatsoever in advertising or material of an advertising character. Most likely the effects are neutral to negative—and in as far as they are positive, this makes the use grossly unethical!

Additionally, I quote a text on naive links written in the interim:

Youtube provides many examples of making too specific assumptions. For instance, a video that asks the users to “comment below” might become misleading even through a minor Youtube redesign. Others, e.g. “please ‘like’ this video” might survive even a drastic redesign, but would still be irrelevant if moved to or viewed in another context, e.g. after a manual download.

Written by michaeleriksson

June 12, 2019 at 8:35 pm

Some observations after reading up on literary theory

During some further renovations in my building, I have made two prolonged visits to the city library. Specifically, I have downed about a third of “Literaturtheorie”* by Oliver Jahraus. While some of the contents are very interesting, my overall impression is not that favorable and I (a) see my less than stellar impression of the non-natural sciences re-inforced and (b) have gained a somewhat better understanding of what is going wrong in the academic system.

*Unsurprisingly, “Literary Theory”. I do not know whether an English translation of the actual work exists, or whether any such translation kept a literal version of the title.

Because I do not have a copy at home (and because I read with an intent on learning something about literary theory—not to write a non-literary critique), I must be a bit on the vague side. However:

  1. The text is filled with a type of specious, “non sequitur”-y reasoning that I have repeatedly observed in softer fields (and in e.g. some types of political and religious propaganda): Premises are stated that are not necessarily convincing and/or obviously represent personal opinion and/or only cover a particular perspective; based on these premises, one or several (il)logical jumps are made to reach some type of conclusion; this conclusion is fed into another series of (il)logical jumps; and at the end a thesis is stated as if proved beyond reasonable doubt. To boot, this often involves disputable use of different-concepts-represented-by-the-same-word.* As usual, I have the impression that the respective author has a particular opinion, be it well-founded or not, knows that he lacks strong arguments, and tries to create a chain of somewhat plausible sounding arguments that will give the impression that he has proved his opinion—while in reality the argumentation borders on the nonsensical. Indeed, this type of argumentation is often so weak that it becomes impossible to attack, because there are more holes than substance—launching a counter-argument would be like punching fog.

    *Similar to jokes in the manner of “zero is smaller than one; zero is nothing; ergo, nothing is smaller than one; ergo, minus one is not smaller than one”. (But intended to be taken at face value and more subtle.)

    I do have a suspicion that there is a strong element of “The Emperor’s New Clothes” involved—that many nod in agreement in order to not seem stupid, believing there to be considerable substance in such texts that they simply are unable to see. In reality, the emperor is as naked as he seems.

    My earlier text on “Der Untergang des Abendlandes” mentions some similar problems. I also point to the Sokal hoax.

  2. One of the core ideas of the book seems to be that literary theory is mainly an attempt to answer the question “What is literature?”, which would raise some serious concerns as to whether it is worth bothering with as an academic field. Certainly, the question is a worthy one, and an analogous question is often asked in other fields; however, this question is typically just the first step, something answered to e.g. limit research to a sufficiently well defined or sufficiently small topic, or to ensure that various parties speak of the same thing. If it is allowed to be the dominant question of the entire field, then the field amounts to navel-gazing and self-referential orgies.
  3. At the same time, paradoxically*, he appears to see literary theory (and/or literary science, in general) as the epitome of scientific development, and seems to want to raise it to a model for other fields, including the natural sciences… In this, he deals more with a philosophy of science than with literary science. Not only is this nonsensical and presumptuous—it also amounts to turning a flaw into a virtue…

    *Thinking back, quite a lot of his claims are paradoxical, e.g. on the pattern “X is strong because of X having a weakness”.

    Moreover, the reasoning used was largely based on characteristics of softer fields, which makes a generalization to harder fields inappropriate. This point can be quite important in the larger picture, e.g. with an eye on post-modernism and its often outright misological take on science: What if this is largely simply a matter of inappropriate generalization, possibly through a lack of an understanding of the harder sciences? Notably, the more specific references made to the harder sciences were usually faulty or misleading, including a misrepresentation of Heisenberg’s uncertainty principle.*

    *I do not remember the details, unfortunately, but it might have been a claim about observation of X changing the value of Y, which is not what the uncertainty principle is usually taken to imply. (Which is rather that a more precise determination of the value of X makes the determination and/or value of Y less precise.)

  4. A specific point that annoyed me was a lengthy discussion of “Theorien” (“theories”), where various conclusions were drawn that fall apart on his failure to separate between the concepts of model, theory, and hypothesis, randomly mixing aspects of each under what he referred to as “Theorien”. (I admit that the borders between the three can be both hard to determine and a matter of dispute, but mixing them in a blanket manner is going too far.)
  5. The language pushes the border of the acceptable, leaving me with the impression of someone trying to “sound smart” (not at all unusual in the softer fields). This includes odd choices of words, e.g. the Latin or English loan “evozieren” to imply “evoke”, where standard German would normally call for the more Germanic “hervorrufen”. (As in, hypothetically, “the text evoked strong feelings” and “der Text hat starke Gefühle hervorgerufen” vs. “der Text hat starke Gefühle evoziert”.) It also includes those pointless and pseudo-intellectual hyphenated constructs that are so common in e.g. texts on art or Marxism (see excursion). While the overall sentences used are nowhere near as bad as Spengler’s (cf. link above), there is some similarity e.g. in undue jumps within a sentence and undue complexity (even by my standards); he also tends to throw in words in a manner that can make the one word correctly parsable only when the reader is five words past it (somewhat in the style of a “garden-path sentence”).

If* this type of understanding of the sciences, lines of reasoning, lack of stringency, whatnot, is typical for the softer sciences, we might as well give up on them…

*Chances are that the “if” holds—this is not the first time I have had a similar experience.

Excursion on pseudo-intellectual hyphenated constructs:

Remarks: (1) I am a little uncertain whether these are common in English, but I have often seen them in both Swedish and German. Should they be uncommon, consider combinations like “abstrakt-biomorphe*” (“abstract-biomorph[ic]”?) and “zynisch-satirisch” (“cynical-satirical”?). (Both are taken from a German art catalog.) Note, in contrast, more legitimate examples like “manic-depressive” and “Marxist-Leninist”, where the introduction of a single word is highly sensible, the word is accepted domain terminology, and the word has spread into the general vocabulary. (2) Here, I have used a plain hyphen (“-”), consistent with most of the examples that I have seen. However, an en-dash (“–”) does seem more natural to me in many or most cases. (3) Note that the issue is not one of hyphenation, per se, but of a particular way of merging two (usually) modifiers to form a new unity, despite their not naturally belonging together (or having a connection better expressed in a more conventional manner). In contrast, e.g., my above “different-concepts-represented-by-the-same-word” does not serve to introduce a new and “smart sounding” word but to make clear that these words are tightly bound together, in order to make parsing easier for the reader.

*The use of “biomorph” leaves me skeptical for other reasons, including the low understandability and the failure to use something more naturally German. Going by the components of the word, it likely means something shaped like something living, but that is very vague and almost necessitates the application to something which was not living to begin with (or the “biomorphy” would not be worth mentioning). However, it is possible that the meaning is detectable through context (I have not studied the catalog in detail) or that this is an established word within the art world.

These have puzzled me since my first encounter, almost certainly more than thirty years ago. At that time, I thought they were some type of domain specific terminology with precise technical meanings*—today, I lean towards expressions created to sound smart or a (typically highly misguided) stylistic means of expressing something. For instance, “cynical-satirical” is unlikely to have an established wider meaning, and likely expresses the same thing as “cynical and satirical” or**, on the outside, “satirical in a cynical manner”. With “abstract-biomorph”, I am even puzzled whether this would express something different than “abstract biomorph” (note space), because the most reasonable interpretation is something that is biomorph in an abstract manner (but possibly it is intended to signify something that is simultaneously abstract art and biomorph). In some cases, the construct appears to be just a means to contract two separate or semi-separate thoughts into one word, as with the hypothetical*** “I typed a text while drinking some water” vs. “I drinkingly-typingly produced a text”.

*Note that this is the case with e.g. “manic-depressive”.

**The introduction of an unnecessary ambiguity is a good reason to avoid such constructs. But for that, I might have given the specific special-case of “cynical-satirical” a pass for convenience, and I might very well have used a “cynical/satirical” myself (note the use of a slash, not a hyphen, which avoids the ambiguity).

***I did not find a specific real example on short notice.

Such manipulatively-confounding writings amusingly-annoyingly strike me as tauro-fecal.

Excursion on other visitors, group-study, etc.:
During my first visit, most other visitors (in the area where I read) appeared to be college age and actually appeared to study (and to do so individually). During the second, they seemed a few years younger and spent more time talking, giggling, and even (playfully) hitting each other. While some of the talking did revolve around some school topic (judging by the two sitting nearest to me), it is clear that these sessions were nowhere near as productive as they could have been. This matches my own experiences* well: Group-study is usually unproductive for good heads, nowhere near as helpful for poor heads as educators claim,** and tends to follow a tempo determined by the most bored and/or unfocused individual.*** To boot, these people disturb the more serious visitors.

*Which are limited through this very observation: I turned down requests for group-study as a matter of course, once beyond the age when they could be forced upon me by teachers.

**Because the poor heads would learn from the better heads, which is rarely the case: Having things explained brings less than understanding them on one’s own, and with group study the emphasis is shifted in the wrong direction.

***Similar claims often apply to group-work as well, often deteriorating into one or two persons doing most of both work and thinking, while the rest mostly free-load or even act to the detriment of the project.

Excursion on continued reading:
I have not yet made up my mind on whether to continue with this specific book, should I seek refuge in the library again. On the one hand, my overall impression is of a relatively poor return on the invested time; on the other, the parts that are likely to be most useful to me are still left. (With an eye on my attempts to be an author of fiction, my superficial formal knowledge of literary science, theories, criticism, …, is a potential weakness—albeit not one that is of critical importance.)

Written by michaeleriksson

June 12, 2019 at 1:17 am

Tennis, numbers, and reasoning: Part II

leave a comment »

To continue the previous part:

There are a lot of debates on who is the GOAT—the Greatest Of All Time. While I will not try to settle that question,* I am greatly troubled by the many unsound arguments proposed, including an obsession with Grand-Slam tournaments (“majors”) won. This includes making claims like “20 > 17 > 15” (implying that Federer is greater than Nadal, who in turn is greater than Djokovic, based solely on their counts at the time of writing) and actually painting Serena** Williams (!) as the “she-GOAT”. The latter points to an additional problem, as might the original great acclaims for Sampras, namely a tendency to value “local heroes” more highly than foreigners.***

*But I state for the record that I would currently order the “Big Three” Federer > Djokovic > Nadal (for a motivation, see parts of the below); probably have Djokovic > Sampras > Nadal; and express great doubts about any GOAT discussion that ignores the likes of Borg, Laver, Gonzales, Tilden. I would also have at least Graf > Serena (see excursion), Court > Serena, Navratilova > Serena.

**To avoid confusion with her sister Venus (another highly successful tennis player), I will stick with “Serena” in the rest of this text.

***Relative to the country of the evaluator and not limited to the U.S. The U.S. is particularly relevant, however, for the dual reason that authorship of English-language articles, forum posts, whatnot comes from U.S. citizens disproportionately often (measured against the world population) and that U.S. ideas have a considerable secondary influence on other countries.

The fragility of majors won is obvious e.g. from comparing Borg and Sampras. Looking at the Wikipedia entries for “career statistics” (especially, the heading “Singles performance timeline”) for Borg and Sampras, we can e.g. see that Borg won 11 majors by age 25, while largely ignoring the Australian Open, and then pretty much retired*; while Sampras was at roughly** 8 at this age and only reached his eventual 14 some six years later. To use Sampras’ 14 majors as the sole argument for him being greater is misleading, because Borg might very well have won another 3 merely by participating in the Australian Open—or by prolonging his serious career for a few years more.***

*His formal retirement situation is a little vague, especially with at least one failed comeback, but it is clear that he deliberately scaled back very considerably at this point.

**I have not checked exact time of birth vs. time of this-or-that tournament, because it is very secondary to my overall point. The same might apply to some other points in this text.

***There are, obviously, no guarantees. For instance, as it is claimed that Borg suffered from a burn-out, he might not have been able to perform as well for those “few years more” (and/or needed a year off to get his motivation back) and playing the Australian Open might have brought on the burn-out at an earlier stage. Then again, what if the burn-out had been postponed by someone telling Borg that “your status among the all-time greats will be determined by whether you have more or less than 14 majors”…

More generally, the Australian Open was considerably less prestigious than the other majors until at least the 1980s, and many others, e.g. Jimmy Connors, often chose to skip it. The 1970s saw other problems, including various boycotts and bans (Connors, e.g., missed a number of French Opens).

Before 1968, the beginning of the “open era”, we have other problems, including the split into amateur and professional tennis, which (a) led to many of the leading pros having lesser counts than they could have had (Gonzales 2!!!), (b) softened the field for the amateurs, leaving some (most notably Emerson) with a likely exaggerated count.

On the other end, we have to look at questions like length of career vs. number of majors, with an eye on why a certain length of career was reached. Federer, for instance, has reached considerable success at an age that would have been considered almost absurd in the mid-1980s, when I first watched tennis—players were considered over the hill at twenty-five and teens like Wilander, Becker, Chang were serious threats.* Is this difference because Federer is that much of a greater player, or is the reason to be found in e.g. better medicine or different circumstances of some other type? Without at least some attempt at answering that question, a comparison of e.g. Wilander and Nadal would be flawed**: Both won three majors in their respective best years (1988, 2010) around age 24. Wilander never won another and ended with 7; Nadal was a bit ahead at 9 already, but has since added another 8***!

*Interestingly, I do recall that there was some puzzlement as to why tennis was suddenly dominated by people so young, when it used to be an “old” man’s sport. Today, we have the opposite situation.

**From a “methodological” point of view. It is not a given that the eventual conclusion would be different, because it is possible to be right for the wrong reason. (Certainly, in this specific constellation, the question is not so much whether Wilander trails Nadal, as by what distance. Is 17–7 a fair quantification or would e.g. 17–13 be closer to the truth?)

***This is written shortly before the 2019 French Open final, which might see yet another added. If so, fully half (and counting…) of his tally came after the age when Wilander dropped out of sight.

Or how about the claimed “surface homogenization”, i.e. that the different surfaces (grass/hard court/clay) play more similarly to each other than in e.g. the 1990s? Is it possible that the Big Three would have been less able to rack up major* wins, with more diverse surfaces? Vice versa, should some of the tallies of old be discounted for being played on fewer surfaces? (Notably, grass was once clearly dominant.)

*Looking past the majors, we can also note the almost complete disappearance of carpet.

Then there is the question of competition faced. For instance, with an eye on the dominance of the Big Three, is Wilander–Nadal a reasonable comparison, or would e.g. Wilander–Murray or Wilander–Wawrinka be more reasonable? Who is to say that Wilander would have got past 3 majors or that Murray/Wawrinka would have been stuck at 3, had their respective competition been switched? What if the removal of just one of the Big Three had given the remaining two another five majors each? (While the removal of some past great would have given his main competitors two each?) The unknowns and the guesswork needed make the comparison next to impossible when two players were not contemporaries.

For that matter, below a certain number of majors won, the sheer involvement of chance makes the measure useless. Comparing Federer and Sampras might be somewhat justified, because they both have a sufficiently large number of wins that the effects of good and bad luck are somewhat neutralized (“you win some; you lose some”)—but why should Johansson (1 major) be considered greater than Rios (none)? (Note that Rios was briefly ranked number one, while Johansson was never even close to that achievement.) How many seriously consider Wawrinka the equal of Murray (both at 3)?

Many other measures are similarly flawed. So what if Nadal has more “masters” wins than Connors? Today, these tournaments are quasi-mandatory for the top players, while they were optional or even non-existent during Connors’ career. Many of the top players of the past simply had no reason (or opportunity) to play them sufficiently often to rack up a number that is competitive by today’s standards. (But, as a counter-point, those who did play them might have had an easier time than current players due to lesser competition.)

Tournament wins (in general) will tend to favor the players of the past unduly, because many tournaments were smaller and (so I am told) the less physical tennis of yore made it possible to play more often—and not having to compete in e.g. the masters allowed top players to gobble up easy wins in weaker competition.

Looking at single measures, I would consider world ranking the least weak, especially weeks at number one. (But I reject the arbitrary “year end” count as too dependent on luck and not comparable to e.g. winning a Formula One season or to the number-one-of-the-year designations preceding the weekly rankings.) However, even this measure is not perfect. For instance, Nadal trails Lendl in weeks at number one, but has a clear advantage in terms of weeks at number two—usually (always?) behind Federer or Djokovic. Should Lendl truly be given the nod? Borg often trailed Connors in the (computerized) world ranking while being considered the true number one by many experts; similarly, many saw Federer as the true number one over Nadal for stretches of 2017 and 2018 when Nadal was officially ahead. Go back sufficiently far (1973?) and there was no weekly ranking at all.

The best way to proceed is almost certainly to try to make a judgment over an aggregate of many different measures, including majors won, ranking achievements, perceived dominance, length of career, … (And, yes, the task is near impossible.) For instance, look at the Wikipedia page on open era records in men’s singles* and note how often Federer appears, how often he is the number one of a list, how often he is one of the top few, and how rarely his name does not appear in a significant list. That is a much stronger argument for his being the GOAT than “20 majors”. Similarly, it gives a decent argument for the Big Three being the top three of the open era; similarly, it explains** why I would tend to view Djokovic as ahead of Nadal, and why I see it as more likely that Djokovic overtakes Federer than that Nadal does (in my estimate, not necessarily in e.g. the “has more majors” sense).

*A page with all-time records is available. While it has the advantage of including older generations, the great time spans and changing circumstances make comparisons less reasonable.

**Another reason is Nadal’s relative lack of success outside of clay. He might well be the “clay-GOAT”, but he is not in the same league as some others when we look at other surfaces and he sinks back when we look at a “best major removed” comparison. For instance, if we subtract his French-Open victories, he “only” has 6 majors, while Federer (sans Wimbledon) still has 12 (!), Djokovic (sans Australian Open) has 8, and Sampras (sans Wimbledon) has 7.
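The “best major removed” comparison can be reproduced with a few lines of Python. This is only a sketch: the per-tournament breakdowns are not stated in the text above, but are my understanding of the tallies at the time of writing (shortly before the 2019 French Open final), and the names `majors` and `best_major_removed` are mine.

```python
# Majors per tournament at the time of writing (assumed breakdown; the
# totals 20, 17, 15, 14 match the counts used elsewhere in this text).
majors = {
    "Federer":  {"Australian Open": 6, "French Open": 1,  "Wimbledon": 8, "US Open": 5},
    "Nadal":    {"Australian Open": 1, "French Open": 11, "Wimbledon": 2, "US Open": 3},
    "Djokovic": {"Australian Open": 7, "French Open": 1,  "Wimbledon": 4, "US Open": 3},
    "Sampras":  {"Australian Open": 2, "French Open": 0,  "Wimbledon": 7, "US Open": 5},
}


def best_major_removed(counts):
    """Return (best tournament, total wins without that tournament)."""
    best = max(counts, key=counts.get)
    return best, sum(counts.values()) - counts[best]


for player, counts in majors.items():
    best, rest = best_major_removed(counts)
    print(f"{player}: {sum(counts.values())} in total, {rest} without {best}")
```

The same function could be applied to any other player, provided that the per-tournament counts are filled in by hand.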

Notes on sources:
For the above, I have drawn on (at least) two other Wikipedia pages, namely [1] and [2]. Note that the exact contents on Wikipedia, including page structure, can change over time, independent of future results. (That future results, e.g. a handful of major wins by Nadal, can make exact examples outdated is a given.)

Excursion on Serena vs. Graf:
Two common comparisons are Federer vs. Sampras and the roughly respective contemporaries Serena vs. Graf. If Federer is ahead of Sampras, then surely Serena is ahead of Graf? Hell no!

Firstly, if we look just at majors won (which is the typical criterion), we find that Graf hit 22 majors at age 29* and retired the same year, while Serena had 13 at a comparable age, hit 22 at age 34/35 and only reached her current (and final?) tally of 23 a year later. By all means, Serena’s longevity is to be praised, but pulling ahead by just one major over such a long time is not impressive. Had Graf taken a year off and returned, she would be very likely to have moved beyond both 22 and 23. In contrast, Federer reached (and exceeded) Sampras’ tally at a younger age than Sampras—and then used his longevity to extend his advantage.

*Not to mention 21 several years earlier, after which she had a few injury years.

Secondly, most other measures on the women’s open era records page put Graf ahead of Serena, including weeks at number one. This all the more so when we discount those measures where Serena’s longer career has allowed her to catch up with or only barely pass Graf.

Excursion on GOAT-but-one, GOAT-but-two, etc.:
While determining the GOAT is very hard, the situation might be even worse for the second (third, fourth, …) best of all times. A partial solution that I have played with is to determine the number one, remove his results from the record (leading to e.g. a new set of winners), re-determine the number one in this alternate world, declare him the overall number two, remove his results from the record, etc. For instance, Carl Lewis is the long-jump GOAT by a near unanimous estimate, but how does e.g. Mike Powell (arguably the number two of the Lewis era) compare to greats like Jesse Owens and Ralph Boston? Bump everyone who lost to Lewis in a competition by one spot in that competition, re-make the yearly rankings without Lewis, etc., and now re-compare. While I have not performed this in detail, a reasonable case could now be made for Mike Powell as the number two of all time.
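The remove-and-re-determine idea can be sketched in Python. Beware that this is a toy version under strong assumptions: the athletes (A to D) and their placings are invented, “greatness” is reduced to the number of first places, and ties are broken alphabetically. None of this is a claim about real results or about how the measure should be chosen.

```python
from collections import Counter


def wins(competitions, removed):
    """Count first places after striking the removed athletes from each result list."""
    tally = Counter()
    for placings in competitions:
        remaining = [a for a in placings if a not in removed]
        if remaining:
            tally[remaining[0]] += 1  # everyone behind a removed athlete moves up
    return tally


def iterative_ranking(competitions, depth):
    """Pick the number one, remove him, re-determine the number one, etc."""
    ranking = []
    for _ in range(depth):
        tally = wins(competitions, set(ranking))
        if not tally:
            break
        # Most wins first; alphabetical order as an arbitrary tie-breaker.
        best = min(tally, key=lambda a: (-tally[a], a))
        ranking.append(best)
    return ranking


# Invented placings (best first) for five hypothetical competitions.
competitions = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "B", "C"],
    ["C", "A", "B"],
    ["D", "C", "B"],
]

print(iterative_ranking(competitions, 3))
```

In this invented example, B never wins a competition in the original record, but is the most frequent winner once A has been removed, and ends up as the number two ahead of C and D (who each have one actual win).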

Unfortunately, this is trickier in tennis than in e.g. the long jump, because of the “duel” character of the former. For instance, if we were to call Federer the GOAT and tried to bump individual players in a certain tournament won by him, would it really be fair to give the runner-up the first place? How do we know that the guy whom Federer beat in the semi-final would not have won the final? Etc. (A similar problem can occur in the long jump, e.g. in that someone who was knocked out during the U.S. Olympic trials in real life, might have done better than those who actually went, after the alternate-reality removal of a certain athlete. The problem is considerably smaller, however.)

Written by michaeleriksson

June 9, 2019 at 12:17 am

Posted in Uncategorized


A German’s home is not his castle / a few issues around inspections and meter readings

leave a comment »

One of the great annoyances with living in Germany is the one, two, or more* service companies that invariably demand entry to one’s apartment every year—after having made a one-sided declaration of date and time, and usually with a comparatively short** advance warning. Moreover, this is usually done through simply posting a notice on the door of the building (often on the outside), with the implications that (a) people who are not currently present, including those who live elsewhere*** and those currently on vacation, might not have the ability to react in time, (b) the notice can be removed by another party, including playing children. Of course, this type of announcement could easily be done by a fraudulent entity who just wants access to the apartments.

*I have three myself, and it might have been four or five had not the gas and electricity meters been outside the apartment… These are two to respectively inspect the smoke detectors and the exhaust/chimney for the gas heater, and a third to read the water meter. (An earlier text might have claimed that the chimney inspection took place once every three years. This was an early misunderstanding on my part.)

**I have not paid great attention, but a rough guesstimate would be ten days for a typical notice. I have seen less than a week on at least some occasion.

***For instance, those who try to rent out an apartment and who currently do not have a tenant; for instance, those (like me, in the past) who spend months on end living elsewhere due to work.

True, missing the date is not the end of the world, because these companies are obliged to provide alternative dates upon request. However, this is usually not handled well. For instance, many notices fail to inform about the right to request a different date, and contact information is usually limited to telephone* only. The chimney-sweep, whose recent notice is the trigger for this text, does have an email address, but fails to mention it. The notice does mention the possibility of requesting an alternate date, but it does so in such a different font size and color (compared to the rest of the text) that I actually did not recognize it before a closer inspection.** Moreover, it speaks of a “rechtzeitig” (roughly, “timely”) contact, which is very vague and in most circumstances would be taken to imply that the contact must take place before the scheduled date (which is not the case and would be unconscionable for the absent). The smoke-detector service, on the other hand, appears to have no interest in actually going through with replacement dates,*** implying that my smoke detectors have not been serviced since before I bought the apartment, because the previous owner apparently also had problems with it. A similar issue is present with some other apartments in my building.

*Which, combined with typical office hours, can be inconvenient for those who work during the day, highly troublesome for those who work during the night, and a severe obstacle for the deaf and mute.

**But, unlike many others, I was already well aware of my right.

***Presumably, either to avoid the extra cost of a second visit or to push the delay to the point that there is a pseudo-justification to request a billable visit. (By regulation, at least a first replacement date must not come with an extra charge to the apartment residents.)

Now, the chimney inspector was open to providing a new date, but this too was fraught with complications. On the one hand, no dates were available before July 12th (still more than a month ahead). My suggestions of the 19th and the 26th, picked to have a greater time flexibility than the 12th, were rejected due to “Betriebsferien” (“company holidays”) between July 15th and August 1st… Moreover, the possible hours were restricted independent of date, including a 3 PM upper limit Monday through Thursday and 2 (!) PM on Fridays. Effectively, to get it done after work is not possible without infringing severely on typical working hours—not just leaving an hour or so earlier than the colleagues. While “before work” is a little easier and might work for most local workers (but not for all and not for many commuters), the end effect is that a portion of the regular work day must be sacrificed. (That Saturday and Sunday are out entirely is hardly worth mentioning in Germany.) This continues an idiocy already discussed for delivery services—a failure to adapt to the needs of the service recipients in favor of a strict adherence to “traditional” working hours, even when the result is more work for the service provider. Indeed, here the working* hours are even a sub-set of the normal working hours, making it even harder. As elsewhere, an outdated world-view (or resulting “legacy procedures”) might have survived through the implicit assumption that every apartment comes with a house-wife.

*The word “working” might be misleading, because the individual employees might have other tasks to perform at other times. The end effect on the residents is the same, however.

Even in those cases, however, when everything works as planned, these notifications are problematic through giving intervals of hours,* often in the middle of the day. For instance, the gas-inspection notice gives 9–11 AM, which implies that even someone who works locally might be forced to take half-a-day off from work—and, when working in Cologne, I would have been forced to take so much time off that I likely would have skipped work altogether.

*Which, obviously, do not state how long the individual visit will take. Instead, it is an understandable matter of “we could come at any time during this interval”, with an eye on questions like how long the visits to other apartments, or even apartment houses, take. The long intervals make this issue worse than the similar problem discussed a paragraph earlier.

Looking at possible solutions, at least some of this will likely take care of itself over time, through the spread of new technology*. However, improvements here and now still make sense. For instance, how about requiring a considerably longer interval for notification, e.g. that notices must be published at least one month in advance?** How about a requirement that notifications are also given per e.g. email (to those who have registered in some manner)? How about more reasonable hours and/or days of visit? Or how about my personal pet idea: Have each city (or some other unit) coordinate two*** fixed, known-to-all, and non-adjacent days a year, for some sub-area. On these, the residents within the sub-area are required to give access to (legitimate) service providers; on others, they must not be bothered****. Notably, this would bring great benefits even to the service providers, because they could cut the costs for repeat visits and most of their own efforts to coordinate with absent residents—or actually charge for them from day one. This scheme would, obviously, require a considerable first effort of coordination, but later adjustments are likely to be small for a typical year.

*Notably, meters that can be read electronically without entering an apartment. However, like e.g. my own current outside-the-apartment gas and electricity meters, this comes with an increased risk of leak of data to unauthorized third parties.

**Note that anything less than two weeks is inherently problematic due to the larger risk that e.g. a vacation absence prevents the residents from being informed on time. In contrast, a full month would make it a near certainty that the notice is present in time for the residents to react. Moreover, the longer interval makes it easier to arrange for e.g. a work absence.

***Using two, instead of one, allows for a greater flexibility, e.g. to compensate for a strike or to make life easier on service providers with unfortunate day collisions for serviced sub-areas; however, each service provider would be expected to only use one of the two (per apartment and/or sub-area), just like it is one day a year today. Note that reserving two days a year will not increase the effort for the average resident, because the two days are the same for all service providers (but it will allow for far better planning).

****Among these annual (or otherwise recurring) activities: when we move to more ad-hoc matters or something requiring a short-term response, e.g. a burst pipe, a strict adherence will not always be reasonable.

I note that as far as solutions are concerned, it is positive if a portion of the burden is passed from the residents to the service providers, because (a) the current system is constructed to the very one-sided advantage of the latter, (b) not all of these bring an advantage to the residents, notably the borderline idiotic yearly smoke-detector inspections and many chimney inspections and whatnots (also see excursion), (c) the matter of entering someone else’s home should not be trifled with. As to the latter, I would personally very much prefer never to have someone in my apartment that I have not explicitly invited (and I would not invite many to begin with); other relevant concerns include the extra cleaning efforts that many, likely in particular the “neat freaks”, will feel necessary to make the apartment sufficiently presentable.

Excursion on chimney-sweeps:
The problems are increased by regulations relating to chimney-sweeps, who are responsible for some tasks in a semi-governmental role—including at least some inspections. Among the many problems is that there is one “official” chimney-sweep who has the right to perform the semi-governmental tasks in a given area: I am allowed to hire another chimney-sweep to perform various tasks—but not all tasks. Because the official chimney-sweep still needs to be involved, there is a strong incentive to just stick with him through-out. To boot, it can be disputed whether the exact checks* involved in my case really should be done by a chimney-sweep at all, or not rather the gas company or a service specialist for gas-heaters.

*Strictly speaking, it appears to be more of an emissions check than a chimney check, with the chimney only playing in as far as a blocked chimney would lead to dangerously large emissions in the apartment.

I read up a fair bit during my first year in the apartment, but have forgotten most of what I read by now. However, there were several web sites and/or forums dedicated to problems around the flawed system. One recurring issue (that I do remember) was skepticism towards the reasonability of inspection intervals in at least some contexts, and some inspections that were outright nonsensical, e.g. that chimneys that were not even used still needed* a yearly inspection.

*In the eyes of the local chimney-sweep. That his interpretation was even formally/legally/bureaucratically correct (let alone practical), was not always a given.

Excursion on other means to calculate costs:
The use of meters to measure consumption of e.g. heating* is laudable from a fairness perspective and might or might not give incentives to consume less energy. However, it is not the only approach possible. For instance, in Sweden, heating costs are typically included in the rent in a blanket manner, and this appears to work well. The heating costs per apartment might be higher** in Sweden, but this is offset** by the costs for reading meters. Similarly, the overall environmental impact might be greater***, but this is partially offset by e.g. the environmental impact of meter readers traveling in cars.

*One of the more common German meter-types is the per-radiator meter that attempts to track the amount of central heating used by individual apartments, to allow a corresponding division of the overall costs.

**The degree varies depending on what is measured and on details unknown to me. If only the cost for the service company is included, it is likely only a partial offset; if the lost time and extra effort for otherwise working residents are included, at least these are likely to see approximately a full offset; and if we look at the overall societal cost, it is almost certainly more than an offset.

***After adjusting for the effects of a colder climate, or it would be a near given.

Excursion on use of “layers” in texts:
A very common practice in e.g. notices, advertisements, brochures, web pages, …, is to give different types of information a different “look”. This is presumably with the intention of putting information in “layers” to be read independently. In my personal experience, this works very poorly, because people (like I above) tend to see only one layer at a time, which implies that the information put into a different layer through e.g. a radically different (foreground?) color runs a risk of being overlooked entirely, especially when having a poor contrast. Such layers might sometimes be helpful when the reader is aware of them in advance, e.g. when comparing the descriptions of many products that have the same layering. More often, it is likely better to not try such tricks and to rely on a simple text flow, intended to be read as a single layer. This text, in turn, might then contain changes in (background?) colors to highlight a different purpose without causing a layer division. If in doubt, just put the different layers on different pages. (Disclaimer: This excursion is unusually “spur of the moment” and might be unusually open to revisions of opinion.)

Written by michaeleriksson

June 6, 2019 at 4:19 am

Tennis, numbers, and reasoning: Part I

Preamble: This and a following text were intended as a single, not that long, piece. Because the length of the first part grew out of hand, I decided to split the text into (at least) two parts. Beware that a mixture of time constraints and the growing-out-of-hand left me lazy with the math: there might be errors through lack of checking that change the details (but not the principle), and there is a lack of explanation. (However, the math is not more advanced than what many high-schoolers encounter.) Note that I use the convention of ^ to indicate exponentiation, e.g. 2^3 = 2 * 2 * 2 = 8, and that “*” might be displayed oddly for technical reasons. (I normally use it only to indicate footnotes, and have not bothered to implement e.g. a math mode in my markup.)

With the latest French Open reaching its deciding phase, I have been reading a bit about tennis. A few resulting observations on tennis, numbers, and reasoning:

(Part I)

There is very little understanding of how probabilities play in when it comes to e.g. who-beats-whom, what is and is not impressive, whatnot. Notably, even many hard-core fans seem to jump to odd conclusions about superiority, inferiority, or who is too far past his prime to be reckoned with based on a single* match. This is highly naive, even when we discount questions like surface preferences, off days, and whatnot.

*Note: “single”, not “singles”.

Consider a hypothetical match-up, where two players (A and B) are so close in abilities that the winner of each individual set is a 50–50 matter. Even in a best-of-five setting, this leaves player A with a one-in-eight chance of a straight-set victory, and ditto for player B. In other words, there is a one-in-four chance that the match will be decided in only three sets, and who wins is a toss-up. Correspondingly, a single straight-set victory does not necessarily say anything about the involved players. In a best-of-three setting, half of the matches would be straight-set victories, and who wins is, again, a toss-up.
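As a quick check of these numbers, here is a minimal Python sketch. The function name `straight_set_probability` is merely illustrative, and the model assumes that sets are independent with a fixed probability per set.

```python
from fractions import Fraction


def straight_set_probability(sets_to_win, p_set=Fraction(1, 2)):
    """Probability that the match ends without the loser taking a set.

    Assumes independent sets, each won by player A with probability p_set.
    """
    # Either A wins the first `sets_to_win` sets, or B does.
    return p_set ** sets_to_win + (1 - p_set) ** sets_to_win


print(straight_set_probability(3))  # best-of-five: 1/4
print(straight_set_probability(2))  # best-of-three: 1/2
```

With anything other than a 50–50 split, the straight-set chance only grows, e.g. `straight_set_probability(3, Fraction(9, 10))`.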

What can be done is to look at “Bayesian probabilities”*, i.e. try to determine the probability of something based on observed events. Given that player A beat player B, we can suspect that his chance of winning is higher. Certainly, if the probabilities of a set win are shifted from 50–50 to 90–10, this would also normally result in player A winning, while a 10–90 shift would typically leave player B as the winner. (But note that even a 90–10 scenario can result in an upset, especially in best-of-three.) To get reliable information from such considerations, however, a fairly large data set can be needed, as in repeated meetings or a clear superiority in terms of games or points won in a single match (but not just the match itself or the sets of the match; of course, any single-match evaluation is prone to other weaknesses, like ignoring the possibility of a single “bad day”).

*Going into details would go past the high-school level and, frankly, I might need to refresh my own memory. The principle, however, is that (a) the probability of X and X-given-that-Y are not (necessarily) the same, (b) suitable choices allow us to e.g. calculate an expectation value for an unknown probability. For instance, the probability that the sum of two fair and six-sided dice exceeds seven is 5/12 a priori but 5/6 given that we already know that a specific one of the dice (say, the first) came up six. Conversely, if this sum exceeds seven at a different ratio than 5/12 over a great number of repetitions, we might conclude that one or both dice are not fair, and even attempt to estimate new probabilities for the individual sides of the dice. The “reasoning” used when it comes to some tennis “experts” could be seen as a highly naive misapplication of this, viz. that “A beat B; ergo, the probability of A beating B is 100 %; ergo, A will always beat B”.
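The dice numbers can be checked by plain enumeration. The sketch below is illustrative only, and all names in it are my own. Note that 5/6 results when a specific die is known to show six; the alternative reading “at least one of the dice shows six” gives 9/11. The last function shows one possible way to estimate an unknown probability, under the loudly stated assumption of a uniform prior (Laplace's rule of succession): a single win then suggests 2/3, not 100 %.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair six-sided dice.
rolls = list(product(range(1, 7), repeat=2))


def probability(event, given=lambda roll: True):
    """P(event | given), found by counting equally likely outcomes."""
    conditioned = [roll for roll in rolls if given(roll)]
    hits = [roll for roll in conditioned if event(roll)]
    return Fraction(len(hits), len(conditioned))


def exceeds_seven(roll):
    return sum(roll) > 7


def first_is_six(roll):
    return roll[0] == 6


def any_is_six(roll):
    return 6 in roll


print(probability(exceeds_seven))                      # 5/12
print(probability(exceeds_seven, given=first_is_six))  # 5/6
print(probability(exceeds_seven, given=any_is_six))    # 9/11


def posterior_mean(wins, matches):
    """Expected win probability after `wins` in `matches` meetings.

    Assumption (not from the text): a uniform prior on the probability.
    """
    return Fraction(wins + 1, matches + 2)


print(posterior_mean(1, 1))  # 2/3
```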

As a notable example, let us look at the one official meeting between Pete Sampras and Roger Federer:

According to an archived version of official statistics, Federer and Sampras won respectively 1 and 0 matches (100–0), 3 and 2 sets (60–40), 31 and 29 games* (51.67–48.33), and 190 and 180 points (51.35–48.65).

*Including a tie-break each. Subtracting tie-breaks, we have 30 vs 28 and virtually the same percentages. Note that the set–game difference is likely increased and the game–point difference diminished through alternating service games (as opposed to e.g. alternating serve after each point).
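The percentages above follow from simple division. A minimal sketch, using only the counts quoted in the text (the helper name `share` is my own):

```python
def share(won, lost):
    """Percentage of units won, rounded to two decimals."""
    return round(100 * won / (won + lost), 2)


print(share(3, 2))      # sets: 60.0
print(share(31, 29))    # games, tie-breaks included: 51.67
print(share(30, 28))    # games, tie-breaks excluded: 51.72
print(share(190, 180))  # points: 51.35
```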

Looking at the overall match, it tells us next to nothing. Indeed, had but one or two points gone differently, it might have been Sampras winning.* The games tell us a little more, but still nothing that could not easily be the product of chance. Only the points give us some truer indication (despite having the smallest relative difference), but even that could be a product of chance or, e.g., some difference** in playing style or point distribution that is of little import.

*At least one example is obvious without looking at the individual development: Federer won the first set tie-break 9–7. Switch two points around and Sampras would, all other things equal, have won the match 3–1 (a somewhat clear victory to the naive eye). Switch one around and he would have had a roughly 50 % chance of winning from 8–8, and there might have been some earlier point in the tie-break, where even a single point would have handed him e.g. a 7–5.

**Consider e.g. a scenario where a player who already is a break up prefers to not fight back on his opponent’s serve, in order to save himself for the next set. (Whether such factors applied in this specific match, I leave unstated.)

This was a genuinely close match and even just looking at the game score, this should be obvious. (Nevertheless, I have seen this match cited as proof that Federer was better* than Sampras, notwithstanding factors like neither of them being in his prime.) Still, the margins on the point level are often fairly small and can still result in notable differences in overall results. For instance, imagine a 0.55 (i.e. 55 %) probability of winning any individual point**, and see how this scales. Winning a point is (tautologically) a 55–45 proposition and the result of a point played will tell us next to nothing (but the score over one hundred, two hundred, three hundred, …, points will be increasingly telling). If we assume that a game is played as best-of-five points,*** we now have a probability of 1 * 0.55^5 + 5 * 0.55^4 * 0.45^1 + 10 * 0.55^3 * 0.45^2 = .5931268750 or roughly 3/5 that player A wins an individual game (per the binomial formula). The difference in game-winning percentage is then almost doubled compared to the point-winning difference. If we now approximate a set as best-of-nine games****, the binomial formula gives roughly a .7189 chance of player A winning a set. Applying this to matches determined by best-of-three and best-of-five sets,***** we then have a match-winning probability of roughly .8074 and .8610, respectively.

*This is another case of my disagreeing with the reasoning behind a claim, not necessarily with the claim itself.

**Glossing over the complication that the probabilities will vary widely depending on who serves.

***This is not the case, nor is it necessarily a very realistic approximation. I considered making a more elaborate model, but deemed it too much work for a demonstration of principle. The best-of-five approximation is easy to calculate and requires no deeper modeling. To boot, it is likely to understate the difference that I try to show, which makes it more acceptable; moreover, the simplification of ignoring serves might be the larger error, had I intended to find more exact numbers (rather than demonstrate the principle); finally, any model of a tennis game that involves fixed probabilities for all points (ignoring e.g. their relative importance, tiredness, nerves, …) is inherently simplistic. (An approximation as best-of-six might have been better, but would have involved the possibility of a draw, while best-of-seven might have overstated the difference.)

****Similar remarks apply.

*****Here the modeling is exact, because matches are played as best-of-three and best-of-five sets.
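The chain from point to game to set to match can be reproduced with a few lines of Python. This is only a sketch of the simplified model described above (best-of-five points per game, best-of-nine games per set, fixed and independent probabilities); the function name `best_of` is my own.

```python
from math import comb


def best_of(n, p):
    """Probability of winning a majority of n independent trials (n odd).

    Playing out all n trials gives the same result as stopping once the
    majority is reached, so the plain binomial tail can be used.
    """
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


point = 0.55
game = best_of(5, point)       # simplified: best-of-five points
set_ = best_of(9, game)        # simplified: best-of-nine games
match_bo3 = best_of(3, set_)   # best-of-three sets
match_bo5 = best_of(5, set_)   # best-of-five sets

print(round(game, 4))       # 0.5931
print(round(set_, 4))       # 0.7189
print(round(match_bo3, 4))  # 0.8074
print(round(match_bo5, 4))  # 0.861
```

Changing `point` to e.g. 0.51 or 0.60 shows how quickly a small edge per point is amplified on the higher levels.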

From another point of view, consider claims like “player A would not be able to take a game off player B”. Even when this applies to a typical match, it does not (or only very, very rarely) apply categorically over all matches played between them, again for statistical* reasons. Assume that player A is so much worse that he virtually never wins a point in his opponent’s service games and a mere 20 % of points in his own service games (making 15–60 a typical score for an own service game). This still gives him a chance of 1/5^4 or one in 625 to win any of his service games to love and .05792 or roughly 1/17 to win it at all by the above best-of-five model. This model might overstate the probability in this case, but if we say 1/30 as a rough guesstimate, and factor in that he would have at least three opportunities to serve per set, he would likely win a game roughly once every three best-of-five** or once every five best-of-three** matches. With a less disastrous difference, the odds improve correspondingly.

*Even discounting factors like player B gifting a game to be kind, player B having a sudden cramp, whatnot.

**Note that this translates to playing (three times) three resp. (five times) two sets under the assumptions made, because he would need absurd luck not to lose in straight sets.
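These service-game numbers can be sketched as follows. The first two values come straight from the text. The function `game_with_deuce` is an addition that is not part of the original argument: it uses the standard formula for a real game (first to four points, win by two) with a fixed point probability, for comparison with the 1/30 guesstimate. All names are illustrative.

```python
from fractions import Fraction
from math import comb

p = 0.2  # chance of winning a point on own serve, as assumed in the text

to_love = p**4  # four straight points: 1/625
best_of_five = sum(comb(5, k) * p**k * (1 - p) ** (5 - k) for k in range(3, 6))


def game_with_deuce(p):
    """Chance to win a real game (with deuce) at fixed point probability p."""
    q = 1 - p
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)  # win to 0, 15, or 30
    reach_deuce = comb(6, 3) * p**3 * q**3         # three points each
    from_deuce = p**2 / (p**2 + q**2)              # win two in a row first
    return before_deuce + reach_deuce * from_deuce


print(round(to_love, 4))             # 0.0016
print(round(best_of_five, 5))        # 0.05792
print(round(game_with_deuce(p), 4))  # 0.0218

# Expected games won per match with the 1/30 guesstimate and three
# service games per set (straight-set losses assumed, as in the text).
guess = Fraction(1, 30)
print(guess * 3 * 3)  # best-of-five, three sets: 3/10
print(guess * 3 * 2)  # best-of-three, two sets: 1/5
```

Under the fixed-probability model with deuce, the chance is closer to 1/46 than to 1/30, which would stretch the “once every three” and “once every five” matches somewhat without changing the principle.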

This type of thinking demonstrates how unbelievable some of the exploits of the all-time greats are. For instance, to win forty straight matches requires an enormous superiority over the average opponent (and/or a ridiculous amount of luck). Prime Federer’s feats are mind-numbing to those who understand the implications, including e.g. ten straight Grand-Slam finals with eight victories—the full, mythical Grand Slam (i.e. all four tournaments won in the same year) is a considerably lesser accomplishment.

Excursion on other sports:
Some of the above applies equally to some or most other sports, e.g. the impressiveness of victories in a row. For instance, if an athlete or a team has a geometric average chance of 95* % of winning any individual competition (e.g. a tennis, boxing, or basket-ball match), the chance of winning ten in a row is 0.95^10 or roughly three in five, twenty in a row carries just a little more than a one in three chance, and forty in a row roughly one in eight. To have an at least 50 % chance at forty in a row, an individual probability of better than 98.28** % is required. Other parts do not apply, due to the unusual scoring (where e.g. a basket-ball game leaves the higher scorer the victor, while a tennis match might see the party with fewer points take the match).

*Note that this is a very high number, seeing that it must last for some time, is vulnerable to external conditions, must cover the risk of injury, etc. Moreover, the geometric average is more sensitive to outliers than the regular arithmetic average. For instance, playing seven opponents with an individual 99 % chance of victory and a single toss-up opponent gives a geometric average of less than 91 % but an arithmetic of 92.875 %.

**To understand how high this number is, note that it cuts the opponent’s chance of winning down to a little more than a third of what it is for 0.95, an already very high number.
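The streak numbers and the comparison of averages can be checked with a short sketch. It assumes independent competitions with a fixed win probability; the variable names are my own.

```python
from statistics import geometric_mean, mean

p = 0.95
for streak in (10, 20, 40):
    print(streak, round(p**streak, 4))  # 0.5987, 0.3585, 0.1285

# Per-match probability needed for a 50 % chance of forty straight wins.
needed = 0.5 ** (1 / 40)
print(round(needed, 4))  # 0.9828

# The opponent's chance relative to the 0.95 case (a little more than a third).
print(round((1 - needed) / (1 - p), 3))  # 0.344

# Seven opponents at 99 % and one toss-up opponent.
chances = [0.99] * 7 + [0.5]
print(round(geometric_mean(chances), 4))  # 0.909
print(round(mean(chances), 5))            # 0.92875
```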

Excursion on probabilities, upsets, and the oddities of score keeping:
It might seem paradoxical that the score keeping used in tennis increases the difference in score compared to a plain point counting, e.g. as with Federer–Sampras above, while also increasing the probability of upsets. This, however, is easy to understand by considering the games and sets as a division of smaller somewhat independent events into larger somewhat independent events. A reasonable analogy is a “plain” election system vs. a “first past the post” system.

This weakness to upsets is arguably a part of the charm of tennis, but it is a strong argument in favor of keeping important men’s matches at five sets and of introducing five-set matches among the women too.

Written by michaeleriksson

June 4, 2019 at 11:07 pm