Michael Eriksson's Blog

A Swede in Germany

Posts Tagged ‘HTML’

WordPress and mangling of quotes

with 4 comments

Preamble: Note that the very complications discussed below make it quite hard to discuss the complications, because I cannot use the characters that I discuss and expect them to appear correctly. Please make allowances. For those with more technical knowledge: The entity references are used for what decimal Unicode-wise is 8220 / 8221 (double quotes) and 8216 / 8217 (single quotes). The literal ones correspond to ASCII/Unicode 34, which WordPress converted to the asymmetric 8220 and 8221. (I stay with the plain decimal numbers here, lest I accidentally trigger some other conversion.)

I just noticed that WordPress had engaged in another inexcusable modification of a text that I had posted as HTML by email—where a truly verbatim use of my text must be assumed.* Firstly, “fancy”** or typographic quotation marks submitted by me as “entity references”*** have been converted to literal UTF-8, which is not only unnecessary but also increases the risk of errors when the page or a portion of its contents is put in a different context.**** Secondly, non-fancy quotation marks that I had deliberately entered as literal UTF-8 had been both converted into entity references and distorted by a “fanciness” that went contrary to any reasonable interpretation of my intentions. Absolutely and utterly idiotic—and entirely unexpected!

*Excepting the special syntax used to include e.g. WordPress tags, and the changes that might be absolutely necessary to make the contents fit syntactically within the displayed page (e.g. to not have two head-blocks in the same page).

**I.e. the ones that look a little different as a “start” and as an “end” sign. The preceding sentence should, with reservations for mangling, contain two such start and two such end signs in the double variation. This is to be contrasted with the symmetrical ones that can be entered by a single key on a standard keyboard.

***A particular type of HTML/XML/whatnot code that identifies the character to display without actually using it.

****Indeed, the reason why I use entity references instead of UTF-8 is partially the risk of distortion along the road as an email (including during processing/publication through WordPress) and partially problems with Firefox (see excursion)—one of the most popular browsers on the web.

The latter conversion is particularly problematic, because it makes it hard to write texts that discuss e.g. program code, HTML markup, and similar, because there the fancy quotes are simply not equivalent. Indeed, this was specifically in a text ([1]) where I needed to use three types of quotation marks to discuss search syntax in a reasonable manner—and by this introduction of fanciness, the text becomes contradictory. Of course, cf. preamble, the current text is another example.

This is the more annoying, as I have a markup setup that automatically generates the right fancy quotes whenever I need them: I have no possible benefit from this distortion that could even remotely compete with the disadvantage. Neither would I assume that anyone else has: If someone deliberately chooses to use HTML, and not e.g. the WYSIWYG editor, sufficient expertise must be assumed, especially as the introduction of fancy quotes is easy within HTML itself, as demonstrated by the fact that I already had fancy quotes in the text, entered correctly.
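To illustrate how little is needed to produce fancy quotes as entity references, here is a minimal sketch in Python. It is not the markup setup mentioned above; the helper name to_entity_references is hypothetical, and the mapping covers only the four code points named in the preamble (decimal 8216, 8217, 8220, 8221). The plain quote (decimal 34) is deliberately left untouched.

```python
# Decimal code points of the typographic quotes, mapped to numeric references.
# Only the four characters named in the preamble are handled (an assumption
# made to keep the sketch small).
FANCY_QUOTES = {
    8220: "&#8220;",  # opening double quote
    8221: "&#8221;",  # closing double quote
    8216: "&#8216;",  # opening single quote
    8217: "&#8217;",  # closing single quote
}


def to_entity_references(text: str) -> str:
    """Replace literal fancy quotes by numeric references; leave all else alone."""
    # str.translate accepts a dict that maps code points to replacement strings.
    return text.translate(FANCY_QUOTES)
```

The result is pure ASCII as far as the quotes are concerned, so it survives any later guess about the encoding, while a plain quote passed in comes out exactly as it went in.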

Excursion on Firefox and encoding:
Note that Firefox insists on treating all* local text as (using the misleading terminology of Firefox) “Western” instead of “Unicode”, despite any local settings, despite the activation of “autodetect”, despite whatever encoding has actually been used for the file, and despite UTF-8 having been the only reasonable default assumption (possibly, excepting ASCII) for years. Notably, if I load a text in Firefox, manually set the encoding to “Unicode”, and then re-load the page, then the encoding resets to “Western”… Correspondingly, if I want to use Firefox for continual inspection of what I intend to publish, I cannot reasonably work with pure UTF-8.

*If I recall an old experiment correctly, there is one exception in that Firefox does respect an encoding declared in the HTML header. However, this is not a good work-around for use with WordPress and similar tools, because that header might be ignored at WordPress’ end. Further, this does not help when e.g. a plain-text file (e.g. of an e-book) is concerned. Further, it is conceptually disputable whether an HTML page should be allowed to contain such information, or whether it should be better left to the HTTP(S) protocol.
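The effect of such a wrong guess can be sketched in a few lines of Python. The sketch assumes that “Western” corresponds to Windows-1252 (Python codec name cp1252); the function name misread_as_western is hypothetical. Bytes that Windows-1252 leaves undefined are replaced rather than raising an error.

```python
def misread_as_western(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as Windows-1252.

    This mimics a viewer that assumes "Western" for a file that was
    actually saved as UTF-8 (assumption: "Western" means Windows-1252).
    """
    raw = text.encode("utf-8")
    # errors="replace" avoids an exception for the few byte values
    # that Windows-1252 does not define (e.g. 0x9D).
    return raw.decode("cp1252", errors="replace")
```

A single literal opening double quote (decimal 8220) comes back as three unrelated characters, while the same quote written as an entity reference consists only of ASCII and passes through unchanged.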


Written by michaeleriksson

November 29, 2018 at 8:27 pm

A discussion of naive use of links

with 2 comments

People with a poor understanding of the Web and/or “hyper text” often add links in odd and suboptimal ways, including what I think of as “here”* links, which make the text specific to the medium, can confuse readers, and/or cause problems for automatic and semi-automatic tools (screen readers, link listers, search engines, whatnots). A typical example**:

*Because of the common use of the word “here”.

**All examples adhere to patterns that I have seen on many occasions; however, none are direct quotes. I mark the linked portions of the text by an opening and a closing “#”. For instance, “a #b# c” implies a text of “a b c” with a link on the “b”. (Using real links could cause problems very similar to those I advise against with this text, and has the added complication that the difference between e.g. “#a# #b#” and “#a b#” is not always obvious.)

Smith has written an extensive report on X. You can find the report #here#.

Consider e.g. how this would look in a link list: Now we have a link identified only by the word “here”… Consider what happens after printing, conversion to plain-text, or similar: What is “here” even supposed to imply? “You can find the report here.” leads to the obvious question “Where?!?”, which cannot be answered from the text itself. (And there need not be any indication in the text that this resulted from a link being lost, since coloring and, often, underlining will also be lost.) Consider the lack of information as to where the link leads. Consider how search engines are hampered in their attempts to make classifications and evaluations. Etc.

Of course, these problems are not restricted to “here”. Consider variations like:

Smith has written an extensive report on X. #Use this link to download.#

Unsurprisingly, it is common for “here” links to occur repeatedly in the same document, often in the same sentence:

You can find the report #here#.
Other sources can be found #here#, #here#, #here#, and #here#.
For more on X click #here#. For more on Y click #here#.

Now the problems are made that much worse, because the links are indistinguishable without further investigation.
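What a link lister or similar tool gets to see can be sketched with Python's standard html.parser module. The class LinkLister and the helpers list_links and indistinguishable_texts are hypothetical names, and the sketch assumes simple, well-formed anchors; it illustrates the problem and is not a description of any particular tool.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkLister(HTMLParser):
    """Collect (link text, target) pairs, roughly as a link lister might."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (text, href) pairs
        self._href = None  # target of the anchor currently open, if any
        self._text = []    # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def list_links(html_text):
    """Return all (link text, target) pairs found in html_text."""
    parser = LinkLister()
    parser.feed(html_text)
    parser.close()
    return parser.links


def indistinguishable_texts(links):
    """Return the link texts that are shared by more than one link."""
    counts = Counter(text.lower() for text, _ in links)
    return sorted(text for text, n in counts.items() if n > 1)
```

Fed a sentence with several “here” links, list_links returns a list in which every entry carries the same text, and indistinguishable_texts reports “here” as shared; only the targets differ, and those are exactly what the reader cannot judge without further investigation.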

Contrast the first example with:

Smith has written #an extensive report on X#.

This avoids most* of the above problems, is more informative, more user-friendly, and shorter to boot. Other alternatives are possible, especially when some reformulation is allowed. For instance, “#Smith has written an extensive report on X#.” is even more informative, but makes the link unnecessarily long (which is why I preferred the chosen version).

*When moving to another medium, the link is still lost; however, the result is at least less confusing. (This could be avoided by spelling out the full link, which is legitimate and sometimes the best thing to do. Much more often, the negative effects on the HTML view would be too large through breaking a natural text flow or taking up too much space, as with e.g. #https://www.fictional-college.edu/~john.smith/my-extensive-report-on-X.html#.)

Similarly, what if the middle part of the third example had been:

Other sources include #an interview with Mike Tyson#, #the book XYZ#, #a NASA study#, and #Smith’s Ph.D. thesis#.*

*The exact texts to use are open to discussion, depending on factors like how the author prioritizes informative links vs. short links, what is clear from context, and similar. Going down to e.g. “[…] include #Mike Tyson# […]” might, depending on context, keep enough information, and would make the link shorter.

Ideally, the resulting text should read in a manner that is agnostic of the medium and could be moved unchanged to e.g. a (printed) newspaper article, as with a regular text that has simply been enriched with links for the convenience of the reader; ideally, the text of the link should have sufficient explanatory value that the reader has a good expectation of what to find even when just looking at a link list, but, at a minimum, when looking at the full sentence of the link. (Which is not to say that these ideals are always realistic or that I, myself, always keep them in mind: I do not. However, by making links even somewhat sensible, and by categorically avoiding nonsense like “You can find the report #here#.”, most of the problems automatically disappear. Compromises that I often deliberately make include “#an older text#” and the “#[1]#” discussed below.)

The alternative of using “#[1]#”, “#[2]#”, etc., is a hybrid between “good” and “bad” linking. Compared to “#here#”, such links have the advantage of being unique, being easily recognizable as references in other contexts (e.g. after printing), and allowing an easier transition to another medium by extending the text with a set of explanatory references (cf. how Wikipedia handles references*). They also allow for easier intra-text references. They still share disadvantages like the linked text not being very informative. I use this alternative fairly often, especially when several links are given at once and/or I will reference the same source later on in the text—however, I stress this is a compromise between effort and result.

*I.e. as a combination of bracketed numbers in the text that refer to a later section with more information, including the full link, book title, author, or whatever might apply. In terms of results, this is arguably a better solution even for HTML; however, it implies a lot more work than if a link is put directly on the bracketed number, and will be impractical in many contexts. To boot, the effort for the reader can also be increased unduly when he is expected to actually visit the linked-to source in a high proportion of the cases (which is not the case with Wikipedia).
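The transition to another medium through a set of explanatory references can be sketched as follows. This is a rough Python illustration under stated assumptions: the function name to_numbered_references is hypothetical, and the regular expression only recognizes simple anchors whose sole attribute is a double-quoted href, which real-world HTML will often violate.

```python
import re

# Matches only simple anchors of the form <a href="...">text</a>
# (an assumption made for brevity; a real tool should use an HTML parser).
ANCHOR = re.compile(r'<a\s+href="([^"]*)"\s*>(.*?)</a>', re.IGNORECASE | re.DOTALL)


def to_numbered_references(html_text):
    """Replace each link by its text plus a bracketed number.

    Returns the rewritten text and a list of reference lines that can be
    appended as a separate section, e.g. before printing.
    """
    targets = []  # link targets in order of first appearance

    def replace(match):
        href, text = match.group(1), match.group(2)
        if href not in targets:
            targets.append(href)
        # The same target always receives the same number.
        return f"{text} [{targets.index(href) + 1}]"

    body = ANCHOR.sub(replace, html_text)
    references = [f"[{i}] {href}" for i, href in enumerate(targets, start=1)]
    return body, references
```

The link text stays in the sentence, so the sentence still reads naturally, and the full link is moved to the reference list where it does not break the text flow.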

Excursion on other problems:
Other common problems include failing to indicate that a link leads to non-HTML content (e.g. a PDF-file), causing unexpected behavior* when the link is clicked; and forcing the opening of external links into new windows/tabs, contrary to the expectations of a reasonable user, potentially breaking tabbed browsing**, and violating the principle that the user should be in control: If the reader wants a page to open in a new tab/window, he actively does so. If he does not, it is not the right of some far away author to override his decision.

*Including opening additional applications (or a sub-standard browser view), often in combination with focus stealing; longer download times; and bandwidth wasted for those on a slow or non-flatrate connection.

**Even when the page is opened in a new tab, the result rarely fits the normal workflow of tabbed browsing, i.e. that tabs are opened in the background for later viewing, while the user remains on the original page. If the browser does not support the conversion of “new window” requests into “new tab” requests, the results can be far worse.

Excursion on too specific assumptions in general:
YouTube provides many examples of making too specific assumptions. For instance, a video that asks the users to “comment below” might become misleading even through a minor YouTube redesign. Others, e.g. “please ‘like’ this video” might survive even a drastic redesign, but would still be irrelevant if moved to or viewed in another context, e.g. after a manual download.

Blogging comes with potentially similar problems, but is rarely as bad (likely because bloggers are less obsessed with “likes” and subscriptions); however, I advise being careful about using highly blogging-specific terminology for a text that might later appear in another context, just like a book or newspaper written in the era of e-books and online news should avoid speaking of the paper it is (not necessarily) printed on. (I acknowledge that I have often violated my own advice.)

Many instructions for computer and whatnot use make far too many assumptions. Consider e.g. giving the user instructions to use a certain key shortcut that is browser specific (or even to “click” on a link), telling him to start a certain program, giving him OS instructions that require Windows (or even a specific version of Windows), giving instructions on how to start an application through menus in a specific language (instead of giving the name of the application to start), …

Written by michaeleriksson

August 16, 2018 at 8:20 am

Posted in Uncategorized

The common design problem of CSS and position: fixed

One of the greater* mistakes in the history of the Web is the idiotic CSS instruction “position: fixed”. This instruction causes a piece of the page, usually the top navigation menu, to remain at the same position relative to the browser window, instead of relative to the web page. Effectively, objects counter-intuitively and annoyingly remain in sight even when the user scrolls.

*My first draft had “greatest”. Then a great number of other web idiocies occurred to me, including such astonishing mistakes as Flash (slowly dying) or the ability for a web site to manipulate the user’s browser history (long gone). Unfortunately, many of the collaborators on and inventors of various Web technologies have been idiots and/or self-serving at the cost of the users. A particular problem, of which “position: fixed” is a good example, is neglecting the interests of and control by the users in favor of the interests of and control by the web sites—quite contrary to the original spirit of HTML.

There are extremely few sensible use cases for this. In fact, off the top of my head, I cannot name a single one. They are bound to exist, but when someone who has spent more than two decades as an avid surfer and sometimes professional web developer cannot name one…

Unfortunately, it is used by more and more sites to implement use cases that are not sensible. Take the aforementioned top navigation menu: This permanently steals screen space from the actual contents of the page without, normally, bringing any benefit to the user. If the menu is present at the top of the page (not window) through e.g. a “position: absolute”, screen space is only lost when looking at the top of the page. After scrolling down, the entire window is used for content, and in the (for most websites) rare cases that the user wants to go back to the top menu, he can do so with one fell click of a button. Nevertheless, these insensible use(case)s have grown so common that it is almost hard to find a website that has not fallen prey to at least one…

This is particularly annoying, because modern displays are almost always* in the 16:9 format, which is far flatter than the old 4:3 or 5:4 formats, and many or most users are underway on notebooks that have smaller screens than desktop displays and often a lower resolution to boot. For instance, I currently write on a notebook with a screen 768 pixels and roughly eight inches tall, a standard reached by many or most “old” monitors in the 1990s (pixels) or even 1980s (inches)! (That my 1366 pixels of width would have been truly outstanding in the 1980s is no comfort in situations like these.)

*Except in the mobile area, where screen space is even more expensive to begin with and the negative effects are even larger…

Not to forget: These 768 pixels must be shared with other items too, including (in my case) the title bar of the browser window, the top and bottom border of the browser window (albeit minimized to 1 pixel), the browser menu, the browser tab bar, and the browser address bar. Many others will have even less space available because they have an OS-taskbar at the bottom of the screen (I have it to the left side) or because they have disabled fewer this-and-that bars in their respective browsers. In the early graphical web browsers of the 1990s there was less such overhead and correspondingly more vertical screen space.

Take the recent, utterly idiotic*, redesign of FML: There is now a “fixed” top menu that takes up about 140 pixels. Add in the hundred or so pixels used for browser bars (and the like), and there are roughly 500 pixels available for the contents (some other users could have less than 400 on the same monitor); we are effectively back to the ancient VGA resolution! Combine this with a large increase in default spacing and font sizes, and a browser window now shows me two or, on the outside, three entries at a time. Before the redesign, there were twice or thrice as many.

*Other problems include poorly chosen colors, a hard-to-read layout, a chaotic navigation, removal of the paging, … The old version, in contrast, was easy to read, user friendly, relaxing on the eyes, and provided more content per browser window. It might not have won any prizes for avant-garde design; however, that simply should not be a concern for a user-friendly website, which should focus on making life easier for the visitors. Indeed, the result is so utterly idiotic that I might give the site up, and I had actually planned to make this post about FML… (I re-prioritized in light of encountering unusually many examples of the fixed top navigation menu today, not to mention a smaller-but-still-ill-advised fixed bottom menu on one of my other favorite sites, online dictionary LEO.) As an aside: It is truly depressing that most re-designs of websites decrease usability in favor of some ill-advised attempt to be “flashy”, “cool”, “interesting”, whatnot.
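The pixel arithmetic behind the FML example can be spelled out in a small Python sketch. The function name content_height is hypothetical, and the figure of 100 pixels for browser bars is an assumed round number standing in for the approximate overhead mentioned in the text.

```python
def content_height(screen_px: int, browser_chrome_px: int, fixed_menu_px: int = 0) -> int:
    """Vertical pixels left for page content after all overhead is subtracted."""
    return screen_px - browser_chrome_px - fixed_menu_px


SCREEN_PX = 768          # screen height from the text
BROWSER_CHROME_PX = 100  # assumption: round figure for title bar, tab bar, etc.
FIXED_MENU_PX = 140      # approximate height of the fixed top menu

# With a fixed menu, the overhead is paid at every scroll position.
with_fixed_menu = content_height(SCREEN_PX, BROWSER_CHROME_PX, FIXED_MENU_PX)

# With a menu that scrolls away, the full area is available after scrolling.
after_scrolling = content_height(SCREEN_PX, BROWSER_CHROME_PX)
```

Under these assumptions, the fixed menu leaves 528 pixels for content at all times, against 668 pixels once a non-fixed menu has been scrolled out of view.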

My advice to web developers: Never use this feature. (If some type of manager demands it, explain why it is a user-unfriendly to user-hostile idea.)

My advice to web surfers: If one of your favorite sites adds a new fixed element, complain. The chance that someone listens is small, but it exists, and it is the greater the more people complain. (Complaining about all uses encountered would be an unrealistic task.)

Written by michaeleriksson

May 4, 2016 at 11:34 pm

Posted in Uncategorized
