Michael Eriksson's Blog

A Swede in Germany

Tor Browser missing the point


I have written before of browser makers having the wrong attitude (recently, Pale Moon; Firefox repeatedly, e.g. [1]) and of people missing the point to such a degree that what they do borders on the pointless.

Unfortunately, the Tor Browser is another case, brought to my mind by a recent “user agent”* issue (cf. below).

*Strictly speaking, “user-agent header”. For simplicity, I will use just “user agent” below.

The Tor Browser is a modified Firefox browser that allows surfing through the anonymisation/privacy/whatnot network Tor, while attempting to remove weaknesses in Firefox that could defeat the use of Tor. On some levels, the developers take a very strict approach, e.g. in that they advise against using Tor with another browser. On others, they are paradoxically negligent.

Consider the following claim from the current version of the Tor FAQ:

Why is NoScript configured to allow JavaScript by default in Tor Browser? Isn’t that unsafe?

We configure NoScript to allow JavaScript by default in Tor Browser because many websites will not work with JavaScript disabled. Most users would give up on Tor entirely if a website they want to use requires JavaScript, because they would not know how to allow a website to use JavaScript (or that enabling JavaScript might make a website work).

This, however, makes both the use of Tor with the Tor Browser and the many alterations of the Tor Browser pointless… Allowing JavaScript is not just “unsafe”: it is a complete and utter disaster, defeating the purpose of Tor entirely! Indeed, I am very, very careful about allowing JavaScript even when not using Tor, because JavaScript not only allows a circumvention of anonymity protection (which is not a concern in a more “vanilla” situation), but also very severely increases the risk of malware infections and whatnots. (To which can be added complications like more intrusive advertising, redundant and annoying animations of other kinds, and similar.) It would be better to use Firefox (over Tor) with JavaScript off than to use the Tor Browser with JavaScript on!

The we-do-not-want-to-scare-away-beginners argument normally carries some* weight; however, here it does not, because the damage done is so massive. This is like a word-processing program that does not allow the user to enter text… I would also argue that because someone is a beginner, it is more important to give him safe defaults—I know the dangers of JavaScript; most beginners do not. These beginners might then surf away as they like, in a false sense of security, and potentially find themselves in jail after insulting the local dictator…

*But only some: To a large part, it is a fallacy, because it so often involves insisting on behavior that benefits the beginners for two days and either harms the more experienced users for years or forces them to invest considerable time in searching for settings/plugins/whatnot to make the behavior more sane. Indeed, in many cases, the result is a background behavior of which most users will not even be aware, despite being harmed by it. (Consider e.g. “accessibility services” that run up processor time, increase the attack surface for hostile entities, make the OS sluggish, …, without ever being used by the vast majority of users.)

A much better solution would be to keep JavaScript off by default and give beginners sufficient information that they can judge why things might not work and when it might or might not be a good idea to activate JavaScript.* Indeed, the nature of anonymity on the Internet is such that Tor is of little benefit unless the user has received some education on the traps and problems.

*In most cases, the answer is “never”: The security loss will always potentially be there, even a trusted website can be abused by third parties, and most sites that require JavaScript to function properly will, at some point, require a de-anonymizing log-in or registration, e.g. to complete a purchase. For the rare exceptions, I would recommend using an entirely different Tor Browser instance.

The text continues:

There’s a tradeoff here. On the one hand, we should leave JavaScript enabled by default so websites work the way users expect. On the other hand, we should disable JavaScript by default to better protect against browser vulnerabilities (not just a theoretical concern!). But there’s a third issue: websites can easily determine whether you have allowed JavaScript for them, and if you disable JavaScript by default but then allow a few websites to run scripts (the way most people use NoScript), then your choice of whitelisted websites acts as a sort of cookie that makes you recognizable (and distinguishable), thus harming your anonymity.

Apart from understating the risks of JavaScript, this argument hinges on an easily avoidable use of NoScript. (Cf. footnote above.) This use is the normal case when using a vanilla Firefox, but it is only a convenience, it is not a good idea with the Tor Browser, and it is not acceptable to let the uninformed dictate behavior for the informed. Better then to inform them! In a pinch, it would be better to not include NoScript at all,* point to the possibility of using several browser instances (with or without JavaScript on), and let those who really, really want NoScript install it manually.

*With some reservations for secondary functionalities of NoScript, which is not just a fine-grained JavaScript on/off tool. Then again, these secondary functionalities could in some cases also help with de-anonymization through making the browser behave a little differently from others and thereby allowing some degree of finger-printing.
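The “sort of cookie” claim in the quoted FAQ text can be illustrated with a rough sketch in Python. It assumes that an observer somehow learns which sites a user has allowed to run scripts; how that would be learned is not shown, and the function names and example domains are hypothetical.

```python
import hashlib


def whitelist_fingerprint(allowed_sites):
    """Derive a stable identifier from the set of sites allowed to run scripts.

    Order and letter case are ignored, so the same whitelist always yields
    the same identifier. This is an illustration, not a real tracking method.
    """
    canonical = "\n".join(sorted({site.lower() for site in allowed_sites}))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def distinct_whitelists(candidate_sites: int) -> int:
    """Number of different whitelists that can be formed from n candidate sites."""
    return 2 ** candidate_sites
```

With only ten candidate sites, there are already 1024 possible whitelists, which is why a per-site whitelist can distinguish users, while a single global on/off setting cannot.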

The same type of flawed thinking is demonstrated in a recent change to the user agent: Historically, this identifier of the browser, OS, and whatnot has had the same default for all Tor Browsers (with occasional updates as the version changed), in order to make it harder to de-anonymize and profile individual users. With the recent release of version 8.0*, this had** changed and at least the OS was leaked. The implication was that e.g. a Linux user could be pinpointed as such, and because Linux users make up a smaller proportion of the overall users, their anonymity was turned into a fraction*** of what it was before.

*Based on Firefox 60.x, incorporating the extreme overhaul of Firefox hitherto kept back. I am not enthusiastic about the changes.

**The developers have recanted in the face of protests, a welcome difference from the way the Firefox developers behave.

***In some sense: Consider a game of “twenty questions”, where the “questioneer” is told in advance that a mineral is searched for… Not only does such information prematurely cut the average search space in three (mineral, plant, animal resp. Linux, MacOS, Windows), but due to the smaller size of the mineral kingdom resp. set of Linux users, the specific current search space is made far smaller.
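The “twenty questions” comparison can be put into numbers with a small Python sketch. The OS shares below are invented for illustration and are not measured Tor statistics; the function names are likewise my own.

```python
import math

# Hypothetical OS shares among users, for illustration only.
HYPOTHETICAL_OS_SHARES = {"Windows": 0.80, "MacOS": 0.15, "Linux": 0.05}


def surprisal_bits(share: float) -> float:
    """Bits of identifying information revealed by an attribute value
    that a fraction `share` of all users have in common."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return -math.log2(share)


def remaining_anonymity_set(total_users: int, share: float) -> float:
    """Expected number of users who still look alike once the value is known."""
    return total_users * share
```

Under these assumed shares, learning “Linux” reveals roughly 4.3 bits (the search space shrinks to a twentieth), while learning “Windows” reveals only about 0.3 bits. The rarer the value, the more answers of the twenty questions are given away for free.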

The justification for this appears to be a fear that websites would (as per the old default) hand out Windows content to Linux users, causing sites to not work. While this is not as bad as the JavaScript issue, it is bad enough, especially since this change was not clearly communicated to the users.

Again, the reasoning behind the change is also faulty: Firstly, the influence of the OS is fairly small and any site that relies on OS information is flawed. Secondly, the opposite problem is quite likely, that a website sees “Linux” and decides “I have nothing tailor-made for Linux. What if the display is not pixel perfect?!? Better to just show an error message!”, even though the site would have worked, had the Windows version been delivered. Combined, these two factors imply that the change likely did more harm than good even for functionality…

A specific argument in favor of the change was that it made little sense to spoof the user agent, because this information could still be deduced by other means. However, almost all these other means require JavaScript to be active, and no reasonable user of the Tor Browser should have JavaScript active (cf. above)! For those who, sensibly, have deactivated JavaScript, the user agent is now an entirely unnecessary leak. To boot, there are situations, notably automatic logging of HTTP requests, that have access to the user agent, but not to other values (or only with undue additional effort). Looking at such a log, an after-the-fact evaluation can show that a Linux (and Tor Browser) user from IP X visited a certain North-Korean site at 23:02 on a certain day, while the JavaScript-based evaluation has to take place in real time or not at all. Possibly, the logs of another North-Korean site show that a Linux user from the same IP visited that site at 23:05. It need not be the same user, but compared to a (real or spoofed) Windows user in the same constellation, the chance is much, much larger.*

*Among many other scenarios. Consider e.g. a certain page on a site which is visited by a Linux user somewhere between 23:00 and 23:30 every day: had he been a Windows user, no one might even have noticed a pattern. Or consider a user visiting one page of a site with one IP at 23:02 and another page with another IP at 23:03: now the risk that the user is recognized as the same is that much larger. Such scenarios obviously become the more serious when other information is added from the “regular” twenty questions. (And while they might seem trivial when applied to e.g. me or the typical reader, they can be very far from trivial in more sensitive situations, e.g. that of a North-Korean fighting for democracy or of someone like Assange.)
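The after-the-fact log evaluation described above could look roughly like the following Python sketch. The `Hit` record and the `correlate` function are hypothetical; real server logs would first have to be parsed into such records, and the ten-minute window is an arbitrary choice.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Hit(NamedTuple):
    site: str       # which site's log the entry came from
    ip: str         # client IP as logged (for Tor, the exit node)
    os: str         # OS as read from the logged user agent
    time: datetime  # time of the request


def correlate(hits, window=timedelta(minutes=10)):
    """Return pairs of hits on different sites that share IP and OS
    and lie within `window` of each other."""
    ordered = sorted(hits, key=lambda h: h.time)
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.time - first.time > window:
                break  # later entries are even further away in time
            if (first.site != second.site
                    and first.ip == second.ip
                    and first.os == second.os):
                pairs.append((first, second))
    return pairs
```

If every Tor Browser reports the same OS, the `os` comparison filters nothing and the match rests on IP and time alone. Once the real OS leaks, the rare value narrows the candidates considerably, and nothing of this requires JavaScript or real-time access.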

Excursion on user agent, etc.:
The situation is the more idiotic, seeing that there are* very, very few cases where e.g. the browser or the OS of the user is of legitimate interest to the website. Apart from statistics** and similar, the main use is to deliver different contents, which is just a sign that the web developers are incompetent: with very, very few exceptions, this should never be needed. If in doubt, it is virtually always better to make a specific capability check*** than to check for e.g. a specific browser. Writing websites that look good/function in all the major browsers, on all the major platforms, and even simultaneously in “desktop” and “mobile” versions**** using the same contents is not that hard, and doing so ensures that the website is highly likely to do quite well in more obscure cases too.

*Today: In the past, this was not always so, with comparatively weak and highly non-standardized browser capabilities. I think back on my experiences with JavaScript and CSS in the late 1990s with horror.

**And what legitimate reasons do websites have to gather statistics on user agents? The answer is almost always “none”. The main reason that is even semi-justifiable is to optimize the website based on (mostly) the browser, and (cf. above) this is almost always a sign of a fundamentally flawed approach—and the solution is to write more generic pages, not to gather statistics. (In contrast, statistics like how many users visit at what hour or from what country can be of very legitimate interest. A partial exception to the above are major technological upheavals like the switch to HTML 5, but these are likely better handled by more central and generic statistics—or, again, specific capability checks.)

***For a trivial example, if a site needs JavaScript to function, it should check for JavaScript with or in combination with the “noscript” tag (not related to the NoScript plugin)—not whether a browser from a short list of known JavaScript capable browsers is used. The latter will give false positives when JavaScript is turned off and false negatives when a rarer-but-JavaScript-capable browser is used.
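The difference between the two approaches can be sketched from the server side in Python. The browser list, the function names, and the markup are hypothetical and only serve to illustrate the point.

```python
from html import escape

# Hypothetical "short list of known JavaScript capable browsers".
KNOWN_JS_BROWSERS = ("Firefox", "Chrome", "Safari")


def sniffed_js_support(user_agent: str) -> bool:
    """The flawed approach: guess JavaScript support from the user agent.

    This cannot see whether JavaScript is actually turned on, and it
    rejects every capable browser that is missing from the list.
    """
    return any(name in user_agent for name in KNOWN_JS_BROWSERS)


def page_with_noscript(body_html: str, fallback_text: str) -> str:
    """The capability-based approach: send the same page to everyone and
    let the browser itself show the fallback when scripts do not run."""
    return (
        "<!DOCTYPE html>\n<html><body>\n"
        f"{body_html}\n"
        f"<noscript><p>{escape(fallback_text)}</p></noscript>\n"
        "</body></html>"
    )
```

The sniffing function answers “yes” for any Firefox user agent, even when the user has JavaScript turned off, and “no” for an unlisted browser that handles JavaScript perfectly well. The “noscript” version needs no knowledge of the browser at all.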

****If different versions are needed at all (dubious), this should be an explicit choice by the user. I note that I have very often preferred to use the mobile versions of various sites when on a desktop, because these typically are less over-wrought, are “cleaner”, have a lesser reliance on (unnecessary) JavaScript, come with less advertising, …

Unfortunately, a fad/gimmick/sham of the last few years has been adaptive web design. Attempts to apply this virtually entirely unnecessary and detrimental concept are the cause of much of the wish for e.g. knowing the browser, OS, screen size*, device type, … (other reasons relate to e.g. de-anonymization, profiling, and targeted advertising), to the point that some have wanted to detect the charge level of a mobile’s battery in order to adapt the page… The last is horrendous in several aspects, including an enormous patronization, the demonstration of a highly incompetent design (no page should ever, not even when the battery is full, draw so much power that this is a valid concern), great additional risks with profiling, and a general user hostility. If this were a legitimate issue, the user should be given an explicit choice: He might prefer to run everything at full speed when low on charge, because he knows that he will be home in ten minutes; he might prefer to run everything at minimum speed even with a full battery, because he is gone for the weekend and has forgotten his charger.

*Screen size might seem highly relevant to the uninitiated, but normally is not: a sufficiently generic design can be made for most types of content. For the rare exceptions, leave the choice to the user.

Written by michaeleriksson

September 27, 2018 at 4:08 pm

Posted in Uncategorized


