TechLife Australia

BROWSER WARS 2020

As Google Chrome crushes all its browser competition, Neil Mohr takes an in-depth look at what makes it so good and why you shouldn’t be using it.

-

We suspect most TechLife readers remember with bitterness and rolling of eyes the Browser Wars of the year 2000 (OK, perhaps it’s more like 1995, but we like round numbers). Back when websites were websites, adorned with user-unfriendly “Compatible with Netscape” logos and “Under Construction” animated GIFs, and took an age to load over crawling 56K modems. Entire websites only worked with a flashy plug-in, and Microsoft broke standards left, right, and center to gain market share. Great days... if by great you mean awful.

You have to hand it to Microsoft – and, indeed, Bill Gates – who foresaw the dominant role the web browser would play in the future, and yet still managed to entirely throw away that market-dominating position to some plucky underdog called, of all things, Google.

Why does it even matter which web browser we choose? Why has the browser become so powerful? What makes a web browser tick, and is there really any difference between them? All of these questions and more will be answered as we dive inside the web browser, benchmark a bunch of them, and ask experts, “Should we be sticking with the browser shoved in front of us by globe-spanning corporations?” Hint: No.

We’re not about to take you back to 1993 and explain the history of the world wide web, aka Web 1.0. That’s done and dusted – thanks, Tim Berners-Lee. We’re jumping straight into the “today” to explore what makes a modern web browser tick, because the differences are vast. The important question to ask is why? What has changed so much over the last 27 years or so that makes modern browsers so complex?

To kick things off, and to perhaps whet your appetite, just considering the basic high-level functions of a web browser reveals a corresponding high level of complexity. Part of this is the network connectivity needed to fetch content over HTTP and its associated protocols, before you can even consider displaying anything.

Even at this stage in the explanation, what we need to understand is that the world wide web is a precarious stack of standards, piled on top of each other, and transmitted over an international-scale network. If any corporation or nation state decides that it wants to interfere with them, things quickly begin to fall apart. Just take DDoS attacks, or certain countries rerouting all traffic by abusing Border Gateway Protocol (BGP) hijacking. On a more relevant level, if a major browser provider wants to undermine open standards, it certainly can – and definitely has done.

Inside a browser

The basic overview of a browser hasn’t changed much since the first ones launched in the mid-1990s, the main additions being support for processing JavaScript and local data persistence. Check out the diagram below to see how a browser is built.

Networking There’s a lot of fetching and carrying with a web browser. HTTP(S) is the core, but there’s FTP for file transfers (mailto: links simply hand off to an email client), and DNS to resolve domain names before pages can be requested from the web server. Not to mention TCP/IP connections and packet transfers.
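Before any DNS lookup or HTTP request can even begin, the browser has to break a URL into its parts. As a rough sketch of that first step, here’s the standard WHATWG URL API (available in modern browsers and Node.js) pulling apart a made-up example address:

```javascript
// Decompose a URL into the pieces the networking layer needs.
// The address itself is invented purely for illustration.
const url = new URL("https://www.example.com:8080/news/index.html?page=2#top");

console.log(url.protocol); // "https:" -> which scheme to speak
console.log(url.hostname); // "www.example.com" -> what DNS must resolve
console.log(url.port);     // "8080" -> TCP port for the connection
console.log(url.pathname); // "/news/index.html" -> what to ask the server for
console.log(url.search);   // "?page=2" -> query string sent with the request
```

Only once the hostname has been resolved to an IP address can the browser open a TCP connection to that port and send the HTTP request for the path.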

User Interface You probably take it for granted, but the interactive decorations around the browser and additional features it may offer – such as bookmarks, history, password storage, and more – are all part of the interface.

Browser engine This is less obvious than the rest, and refers to the intersection between the user interface and the rendering and JavaScript engines, while also linking to the data storage element. For maximum confusion, some projects refer to the browser engine, while others talk about the rendering engine.

Data storage While this started with cookies, local data storage is far more important in modern browsers for use in local applications. Web Storage provides basic key-value storage, Web SQL (now deprecated) offers full local database features, and IndexedDB sits somewhere between the two.
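To give a feel for the simplest of those options, here’s a toy, in-memory stand-in for the Web Storage interface (the API behind `localStorage` and `sessionStorage`). The method names match the standard; the storage class itself is our own invention – real browsers persist this data to disk per origin:

```javascript
// A minimal in-memory sketch of the Web Storage interface.
// Not the browser's implementation - just the same method names.
class MemoryStorage {
  constructor() { this.store = new Map(); }
  setItem(key, value) { this.store.set(String(key), String(value)); }
  getItem(key) { return this.store.has(key) ? this.store.get(key) : null; }
  removeItem(key) { this.store.delete(key); }
  clear() { this.store.clear(); }
  get length() { return this.store.size; }
}

const storage = new MemoryStorage();
storage.setItem("theme", "dark");        // values are always stored as strings
console.log(storage.getItem("theme"));   // "dark"
console.log(storage.getItem("missing")); // null, per the Web Storage spec
```

The string-only values are the big limitation here, and exactly why IndexedDB exists for anything more structured.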

JavaScript engine The programming language of the web, JavaScript enables interactive websites and dynamic content. While it’s designed to be interpreted, modern browsers use a Just In Time (JIT) compiler that translates code to machine code as it runs, to be as fast as possible. Each major browser uses its own engine, which can lead to performance differences.

Rendering engine The core block of any modern browser – we’ll take the majority of our time digging into how this works, which will involve another block diagram. Effectively, this is two parsers: one processing the HTML into Document Object Model (DOM) elements, and the other parsing the Cascading Style Sheet (CSS) data. From these, a rendering tree is generated, laid out, and painted to the display.
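The combining step can be sketched in a few lines. This is a heavy simplification, assuming the HTML parser has already produced a DOM-like tree and the CSS parser a style map (both invented here for illustration): the engine walks the DOM, attaches styles, and drops anything that won’t be painted.

```javascript
// Toy render-tree builder: combine a DOM-like tree with parsed styles,
// skipping nodes that won't be painted (display: none). A sketch only -
// real engines also compute layout boxes, inheritance, cascade, etc.
function buildRenderTree(domNode, styles) {
  const style = styles[domNode.tag] || {};
  if (style.display === "none") return null; // not laid out, not painted
  const rendered = { tag: domNode.tag, style };
  rendered.children = (domNode.children || [])
    .map(child => buildRenderTree(child, styles))
    .filter(child => child !== null);
  return rendered;
}

const dom = { tag: "body", children: [
  { tag: "h1" },
  { tag: "script" }, // present in the DOM, but never painted
  { tag: "p" },
] };
const styles = { body: { display: "block" }, h1: { display: "block" },
                 p: { display: "block" }, script: { display: "none" } };

const renderTree = buildRenderTree(dom, styles);
console.log(renderTree.children.map(c => c.tag)); // [ 'h1', 'p' ]
```

Note how the render tree is not the DOM: the `script` element exists in the document but vanishes from what gets laid out and painted.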

Same but different

We’re going to largely ignore parts of this model, such as networking, the user interface, browser engine, and data storage. It’s not that they’re unimportant – that’s absolutely not the case – but they’re more openly duplicated between systems. Accessing the TCP/IP networking stack and requesting/sending HTTP is donkey work done by standard libraries. The finer points of a user interface are better left for a critical review or group test. And while we’ll mention browser storage, we’re not going into any deep analysis of it.

This leaves us with the two main elements that dictate performance and compliance: the JavaScript engine and the rendering engine.

We’re going to focus on the rendering engine because it’s big and complex. But why all the fuss in the first place – isn’t HTML just HTML? As we alluded to, the web and online applications are built on standards; in the case of HTML, it’s the World Wide Web Consortium, aka W3C, that defines the guidelines on what each HTML tag should do.

The problem is, as with so many aspects of life, guidelines and rules are open to interpretation: what one browser might do with a certain set of tags, another does not, and dumb humans do a whole other set of things, too. As the rendering engine is in charge of interpreting and displaying content, and as different browsers use different engines, that content can end up being displayed differently from browser to browser. Usually, the differences are minimal, but sometimes they can lead to positional changes or, at the extreme end, entire pages failing to display.

Oddly, many quirks of rendering come down to how engines handle error conditions, because this behavior is not standardized. HTML editors and humans can output all manner of crazy, noncompliant code that the poor browser engine then has to parse and interpret as best it can, as we’ll now examine.

Skinning cats

The network engine is doing its thing and fetching web page content, then passing it to the rendering engine. At this point, there are two main ways of handling the content. We’re going to look at how WebKit and Blink deal with the process, but be aware that Gecko, used by Firefox (and derivatives), approaches things in a slightly different order.

Both, however, split the website data into HTML and CSS data – these will be processed separately by their own parsers. What’s that, then? Roughly speaking, a parser takes in the incoming bitstream and translates the data into a node tree; the structure of that tree is defined by the syntax (rules) of the language (HTML or CSS). If you’re aware of basic HTML tags, it should make sense to say the parsing process is split into:

• A lexer – this breaks the input into known tokens (tags) based on the vocabulary rules.
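A toy lexer in that spirit can be written in a few lines. The real HTML tokenizer defined by the WHATWG HTML standard is a state machine with dozens of states and error-recovery rules; this sketch only splits a snippet into tag tokens and text tokens to show the general idea:

```javascript
// Toy HTML lexer: split markup into start-tag, end-tag, and text tokens.
// A sketch only - real tokenizers handle attributes, comments, entities,
// scripts, and malformed input far more carefully.
function lex(html) {
  const tokens = [];
  const re = /<\/?[a-zA-Z][^>]*>|[^<]+/g; // a tag, or a run of text
  for (const match of html.match(re) || []) {
    if (match.startsWith("<")) {
      tokens.push({
        type: match.startsWith("</") ? "endTag" : "startTag",
        name: match.replace(/[<>/]/g, "").split(/\s/)[0].toLowerCase(),
      });
    } else {
      tokens.push({ type: "text", value: match });
    }
  }
  return tokens;
}

console.log(lex("<p>Hello <b>web</b></p>"));
// [ {type:'startTag',name:'p'}, {type:'text',value:'Hello '},
//   {type:'startTag',name:'b'}, {type:'text',value:'web'},
//   {type:'endTag',name:'b'}, {type:'endTag',name:'p'} ]
```

The parser proper then consumes this token stream and, using the language’s syntax rules, assembles the node tree described above.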

A single browser always dominates the desktop landscape.

Championing an open Internet and privacy is a hard job.
