BROWSER WARS 2020
As Google Chrome crushes all its browser competition, Neil Mohr takes an in-depth look at what makes it so good and why you shouldn’t be using it.
We suspect most TechLife readers remember with bitterness and rolling of eyes the Browser Wars of the year 2000 (OK, perhaps it’s more like 1995, but we like round numbers). Back when websites were websites, adorned with user-unfriendly “Compatible with Netscape” logos and “Under Construction” animated GIFs, that took an age to load over crawling 56K modems. Entire websites that only worked with a flashy plug-in, and Microsoft breaking standards left, right, and center to gain market share. Great days... if by great you mean awful.
You have to hand it to Microsoft – and, indeed, Bill Gates – who foresaw the dominant role the web browser would play in the future, and yet still managed to entirely throw away that market-dominating position to some plucky underdog called, of all things, Google.
Why does it even matter which web browser we choose? Why has the browser become so powerful? What makes a web browser tick, and is there really any difference between them? All of these questions and more will be answered as we dive inside the web browser, benchmark a bunch of them, and ask experts, “Should we be sticking with the browser shoved in front of us by globe-spanning corporations?” Hint: No.
We’re not about to take you back to 1993 and explain the history of the world wide web, aka Web 1.0. That’s done and dusted – thanks, Tim Berners-Lee. We’re jumping straight into the “today” to explore what makes a modern web browser tick, because the differences are vast. The important question to ask is why? What has changed so much over the last 27 years or so that makes modern browsers so complex?
To kick things off, and perhaps to whet your appetite, just considering the basic high-level functions of a web browser reveals a correspondingly high level of complexity. Part of this is the network connectivity needed to fetch content over HTTP and its associated protocols, before you can even consider displaying anything.
Even at this stage of the explanation, what we need to understand is that the world wide web is a precarious stack of standards, piled on top of each other, and transmitted over an international-scale network. If any corporation or nation state decides that it wants to interfere with them, things quickly begin to fall apart. Just take DDoS attacks, or certain countries rerouting all traffic through Border Gateway Protocol (BGP) hijacking. On a more relevant level, if a major browser provider wants to undermine open standards, it certainly can – and definitely has done.
Inside a browser
The basic overview of a browser hasn’t changed much since the first ones launched in the mid-1990s, the main additions being support for processing JavaScript and local data persistence. Check out the diagram below to see how a browser is built.
Networking There’s a lot of fetching and carrying with a web browser. HTTP(S) is the core, but there’s FTP for file transfers, SMTP for basic email, and DNS to resolve the domain names in URLs before pages can be requested from the web server. Not to mention TCP/IP connections and packet transfers.
User Interface You probably take it for granted, but the interactive decorations around the browser and additional features it may offer – such as bookmarks, history, password storage, and more – are all part of the interface.
Browser engine This is less obvious than the rest, and refers to the intersection between the user interface element and the rendering and JavaScript engines, while also linking to the data storage element. For maximum confusion, some projects refer to the browser engine, while others talk about the rendering engine.
Data storage While this started with cookies, local data storage is far more important in modern browsers for use in local applications. Web Storage provides basic local variables, but Web SQL offers full local database features, with IndexedDB being a compromise between the two.
JavaScript engine The programming language of the web, JavaScript enables interactive websites and dynamic content. While it’s designed to be interpreted, modern browsers use a Just In Time (JIT) compiler that translates the code into machine code as it runs, to be as fast as possible. Each major browser uses its own engine, which can offer a performance differential.
Rendering engine The core block of any modern browser – we’ll take the majority of our time digging into how this works, which will involve another block diagram. Effectively, this is two parsers: one processing the HTML and Document Object Model (DOM) elements, and the other parsing the Cascading Style Sheet data. From this, a rendering tree is generated, laid out, and painted to the display.
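To make the HTML half of that pipeline a little more concrete, here is a minimal sketch that uses Python’s standard html.parser module to turn markup into a toy node tree. The Node, TreeBuilder, build_tree, and outline names are hypothetical and exist only for illustration; the sketch assumes well-formed input with no void elements such as <br>, and it skips CSS, layout, and painting entirely, so it is nowhere near what WebKit, Blink, or Gecko actually do.

```python
from html.parser import HTMLParser


class Node:
    """A minimal stand-in for a DOM node (hypothetical, for illustration)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a toy node tree; assumes every start tag has a matching end tag."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # Open a new node as a child of the current one and descend into it.
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Climb back up one level (never above the document root).
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return the tree as indented lines, one tag per line."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines
```

Calling outline(build_tree(page)) on a small page gives an indented list of tags, which is roughly the shape of data a rendering engine would then combine with parsed style rules.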
Same but different
We’re going to largely ignore parts of this model, such as networking, the user interface, browser engine, and data storage. It’s not that they’re unimportant – that’s absolutely not the case – but they’re more openly duplicated between systems. Accessing the TCP/IP networking stack and requesting/sending HTTP is donkey work done by standard libraries. Finesses of a user interface are better left for a critical review or group test. And while we’ll mention browser storage, we’re not going into any deep analysis of it.
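As a rough idea of how much of that donkey work a standard library already covers, this sketch uses Python’s urllib.parse to split a URL and assemble the text of a plain HTTP/1.1 GET request. The build_get_request helper is a hypothetical name, the header set is a bare minimum chosen for illustration, and nothing is actually sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return (host, port, request_bytes) for a minimal HTTP/1.1 GET.

    Illustrative only: real browsers send many more headers and also
    handle redirects, caching, cookies, and TLS.
    """
    parts = urlsplit(url)
    host = parts.hostname
    # Fall back to the default port for the scheme when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, port, request.encode("ascii")
```

A browser would next resolve the host name through DNS (in Python, socket.getaddrinfo does this) and open a TCP or TLS connection before sending those bytes; both steps are left out here so the sketch stays offline.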
This leaves us with the two main elements that dictate performance and compliance: the JavaScript engine and the rendering engine.
We’re going to focus on the rendering engine because it’s big and complex. But why all the fuss in the first place – isn’t HTML just HTML? As we alluded to, the web and online applications are built on standards; in the case of HTML, it’s the World Wide Web Consortium, aka W3C, that defines the guidelines on what each HTML tag should do.
The problem is, as with so many aspects of life, guidelines and rules are open to interpretation, and what one browser might do with a certain set of tags, another does not, and dumb humans do a whole other set of things, too. As the rendering engine is in charge of interpreting and displaying content, and as different browsers use different engines, that content can end up being displayed differently from browser to browser. Usually, this is minimal, but sometimes it can lead to positional changes or, at the extreme end, entire pages failing to display.
Oddly, many quirks of rendering are down to how engines handle error conditions, because for much of the web’s history this behavior was not standardized. HTML editors and humans can output all manner of crazy, noncompliant code that the poor browser engine then has to parse and interpret as best as possible, as we’ll now examine.
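To see how much room for interpretation broken markup leaves, consider this small sketch built on Python’s html.parser. It reads the same noncompliant snippet under two different recovery policies for a <p> that is opened while another <p> is still open. The RecoveringParser class, its auto_close switch, and the parse_outline helper are all hypothetical and written purely for illustration; neither policy is claimed to be how any particular engine behaved.

```python
from html.parser import HTMLParser


class RecoveringParser(HTMLParser):
    """Toy parser that records the nesting depth of each start tag.

    auto_close is a hypothetical switch between two recovery policies
    for a <p> that opens while another <p> is still open.
    """

    def __init__(self, auto_close):
        super().__init__()
        self.auto_close = auto_close
        self.stack = []    # tags that are currently open
        self.outline = []  # (depth, tag) pairs in document order

    def handle_starttag(self, tag, attrs):
        if self.auto_close and tag == "p" and "p" in self.stack:
            # Policy B: implicitly close the open <p> and anything inside it.
            while self.stack.pop() != "p":
                pass
        # Policy A (auto_close=False) simply nests the new tag.
        self.outline.append((len(self.stack), tag))
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Ignore stray end tags; otherwise close back to the matching tag.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def parse_outline(html, auto_close):
    parser = RecoveringParser(auto_close)
    parser.feed(html)
    parser.close()
    return parser.outline
```

Fed the snippet <div><p>one<p>two</div>, the first policy nests the second paragraph inside the first, while the second policy places the two paragraphs side by side. Any style rule that depends on nesting would then apply differently, which is the kind of divergence described above.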
Skinning cats
The network engine is doing its thing and fetching web page content, then passing it to the rendering engine. At this point, there are two main ways of handling the content. We’re going to look at how WebKit and Blink deal with the process, but be aware that Gecko, used by Firefox (and derivatives), approaches things in a slightly different order.
Both, however, split the website data into HTML and CSS data – these will be processed separately by their own parsers. What’s that, then? Roughly speaking, a parser takes in the incoming stream of bytes and translates the data into a node tree; the structure of that tree is defined by the syntax (rules) of the language (HTML or CSS). If you’re aware of basic HTML tags, it should make sense to say the parsing process is split into:
• A lexer This breaks the input into known tokens (tags) based on the vocabulary rules.