Sound & virtualisation

Before the WebKit fork we were one of the largest contributors, behind Apple and Google

2018-02-13

Alberto Garcia on QEMU and Spanish open source

Alberto Garcia is a developer for Igalia, a rather successful open source consultancy. From humble beginnings in its native Galicia, in northwestern Spain, Igalia has become a truly international organisation that is involved with all kinds of open source technologies, from browsers and web engines to – Alberto’s own forte – virtualisation with KVM and QEMU. We caught up with him at the Open Source Summit, held in October 2017 at the Prague Hilton.

Linux Format: How did you get into computers, Linux, programming and the related dark arts?

Alberto Garcia: I had an early interest in computers. I think I received my first one aged 11 as a Christmas present from my parents. Back then games consoles weren’t so popular in Spain, so I got a Spectrum and learned how to program on it. Around that time there was a lot of technical stuff in Spectrum magazines, too: they weren’t just about games, there were a lot of details about computer internals in them. So I learned a lot about that from typing out the code printed in the magazines.

LXF: Betraying my age slightly, I too remember the days of PEEK and POKE.

AG: Exactly. Those were the days. So by the time I was writing my first games and small programs I decided that I liked computers and wanted to work with them. So I studied computer science in 1996 and that’s where I learned about Linux, too. It wasn’t particularly popular back then.

LXF: I tried to get Slackware working around that time. I was unsuccessful.

AG: I think the first distro that I tried was called MiniLinux. You could install it from MS-DOS. It used this filesystem called umsdos that implemented POSIX on top of the FAT filesystem. So you could unpack a zip file and run the Linux kernel without having to repartition anything.

Soon after this I moved to Debian. It had the largest selection of software and you could install it from CDs. I liked Debian a lot. I liked being able to see all the details of how everything works under the hood. Back then I felt like I had pretty much mastered MS-DOS, that there were no more secrets there, so Linux became the new challenge. Throughout our studies we had to use UNIX a lot and I discovered I liked that too, so I figured it would be nice if I could actually work on this.

Most of my peers were, in their professional lives, working on Java, Windows NT or whatever boring stuff was popular back then. But I was more interested in Linux, so when I graduated I, along with some classmates, founded a company. Our goal was to work on Linux and do open source in general, and that company was Igalia. We started working locally. The market was very different back then. Most of the things you see now didn’t really exist, so it was a very different company. But we very quickly became involved with the Gnome project. Little by little, we started to work less with local customers and local development and more with upstream communities, contributing directly to those projects.

LXF: Is Debian still your distro of choice?

AG: It is, actually. I’ve been a Debian maintainer for some years now, too. In fact, I think I’ve still got the version of Debian that I installed back in 1997, or whenever it was. I copied the data to a new hard drive and put it in a new machine, and that copy has been kept up to date. But yeah, the original install is still there, too: there are still files from 20 years ago. I like Debian’s centralised development model. There’s very little hierarchy – I like how it’s run. For me it was natural to start contributing to it. I started by maintaining a couple of packages, but now I’m a bit more involved. Our company maintains the WebKitGTK+ port and I package it for Debian. If there’s a problem, Debian enables us to see how the package runs on different architectures, and that helps with the upstream development.

LXF: Tell us a little more about Igalia.

AG: We’re an open source consultancy company based in A Coruña in the Galicia region of Spain. We were founded in 2001 with two goals: open innovation and FLOSS development. In the beginning there were 10 of us, but now we have 60 engineers, distributed across several countries.

More than half the company works with web browsers and associated technologies: we do some work with Mozilla, quite a lot with WebKit and, lately, a lot of Chromium too. We do a lot of graphics work, too. We work with Intel on Mesa development, as well as lots of other things related to toolkits and optimisation in general. We work with compilers and virtual machines, too. We’re mostly focused on JavaScript, so we have a team working with Google on V8, a team working with Mozilla on SpiderMonkey, and a team working with Apple.

We also do lots of multimedia stuff: this month we were at the GStreamer conference and hackfest, which was also in Prague. We maintain its WebKit backends and contribute to core development as well. Those areas probably account for about 90 per cent of what we do. We also work on accessibility, we contribute to the kernel and we have a small team working on a next-gen networking toolkit called Snabb (see www.igalia.com/networking). I’m involved with the virtualisation side of things; other people in this team work with related cloud technologies, such as the Ceph storage system.

Historically, we’ve worked on all sorts of different projects. We’re in lots of different areas today, but I’d say the core one is web browser engines. Ten years ago we started maintaining the WebKitGTK+ port, and most of the development there was done by us. We discovered that there was a lot of interest in having an open source web engine that manufacturers can use in their products.

Before the WebKit fork we were one of the largest contributors, behind Apple and Google. So we grew in that area and started some new projects in closely related areas. Web engines are connected to areas across the whole stack, so we need people that know about multimedia, compilers (for the JavaScript interpreter) and so on. Nowadays, we have people working on the web core, implementing new standards and features; people working on the JavaScript compiler, making it faster; and multimedia people working on GPU acceleration. Aside from that we have a lot of experience working with open source communities and have other teams working in different areas.

LXF: It’s hard to keep up with WebKit, WebKitGTK+, Blink and all the other web engines. Can you give us a bit of background about how WebKitGTK+ came about?

AG: I wasn’t initially part of that team, but we were very involved with the Gnome desktop environment and Gnome had, and still has, a browser called Epiphany. This used the Gecko web engine, but as far as I remember, Gecko was not designed to be embedded. It was designed as a web engine for Firefox, so if you wanted to use it somewhere else it didn’t quite fit. WebKit on the other hand was designed to be just a web engine, and so was built with embeddability in mind. So it seemed like a good idea to add a GTK layer on top of it, so that we could use it in Gnome. Now it’s used not just by Epiphany, but all kinds of other Gnome applications that need to display HTML or web content in general.

LXF: What is the focus of your work?

AG: I’m working with QEMU, particularly the disk I/O and storage side of things, so not really anything to do with web browsers. In the past I’ve worked on virtualisation of hardware, kernel device drivers and suchlike.

LXF: This is quite an action-packed conference: besides incorporating LinuxCon and ContainerCon, there’s also MesosCon, the Kernel Summit and, most relevantly for you I guess, the KVM Forum.

AG: Yes, it’s great these things are co-located. I can be manning the booth, promoting the company, but I’ll also give a talk at the KVM Forum later. It’s a great opportunity to meet people from different communities.

LXF: Can you give us a preview of your talk?

AG: Sure. So I’m working with the QEMU hypervisor. When you create a virtual machine you need to store its data somewhere, and QEMU has this native file format called QCOW2. QCOW2 has lots of features: it supports backing files, snapshots, encryption and compression, and it grows on demand. So when you start with an empty device it takes up only a few hundred kilobytes, and as you add data to it, it grows. This is nice, but it doesn’t perform as well as a raw file, say a 20GB file that you access as a block device. A raw file offers the fastest possible performance here.

With QCOW2 there are some overheads. You need some data structures to handle the snapshots, the translation between the addresses that the VM sees and the offsets in the file on the host, and so on. My work is to reduce that overhead and make QCOW2 perform as fast as raw files. In many cases it does, but there are a few cases where QEMU still needs to be tuned, a few where the QCOW2 format itself needs to be extended, and a few where we need to improve the driver.
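To make the grow-on-demand idea and its translation overhead concrete, here is a minimal Python sketch. It is a toy model, not the QCOW2 format: the class name, the 64KB cluster size and the in-memory dictionary that stands in for the on-disk tables are all assumptions made for illustration.

```python
CLUSTER_SIZE = 64 * 1024  # assumed cluster size for this sketch


class GrowOnDemandImage:
    """Toy image: host storage is allocated only when a guest cluster is first written."""

    def __init__(self, virtual_size):
        self.virtual_size = virtual_size  # size of the disk as the guest sees it
        self.table = {}                   # guest cluster index -> host cluster index
        self.host = bytearray()           # stands in for the image file on the host

    def write(self, guest_offset, data):
        index, within = divmod(guest_offset, CLUSTER_SIZE)
        assert within + len(data) <= CLUSTER_SIZE, "sketch handles one cluster per write"
        if index not in self.table:
            # First touch: append a fresh zeroed cluster to the host file.
            self.table[index] = len(self.host) // CLUSTER_SIZE
            self.host.extend(bytes(CLUSTER_SIZE))
        start = self.table[index] * CLUSTER_SIZE + within
        self.host[start:start + len(data)] = data

    def read(self, guest_offset, length):
        index, within = divmod(guest_offset, CLUSTER_SIZE)
        assert within + length <= CLUSTER_SIZE, "sketch handles one cluster per read"
        if index not in self.table:
            return bytes(length)  # never written, so it reads back as zeros
        start = self.table[index] * CLUSTER_SIZE + within
        return bytes(self.host[start:start + length])


image = GrowOnDemandImage(virtual_size=20 * 1024**3)  # a "20GB" disk
image.write(5 * 1024**3, b"hello")                    # write 5GB into the disk
print(len(image.host))                                # 65536: one cluster allocated
print(image.read(5 * 1024**3, 5))                     # b'hello'
```

Every read and write goes through the table lookup first, which is the kind of overhead a raw file does not have.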

LXF: For someone used to VirtualBox, it can be hard to see how the kernel, KVM and QEMU all fit together. Can you provide some insight here?

AG: KVM (Kernel-based Virtual Machine) is the component in the Linux kernel that enables it to operate as a hypervisor. So it’s a kernel module that lets userspace programs create virtual machines. KVM is independent of QEMU, so you can create a virtual machine using just the KVM API, specifying the number of cores, amount of memory and so on. KVM does all of that and gets the kernel to do all the low-level CPU scheduling. QEMU provides the emulation of all the input and output device drivers, and all the hardware. So basically KVM just creates the VM and isolates the call, and lets everything run there. QEMU provides emulation of disk devices, displays… all of these sorts of things.
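As a rough illustration of using the KVM API directly, without QEMU, the Python sketch below opens /dev/kvm and issues two ioctls. The request numbers are the values defined in the kernel's linux/kvm.h header; the function name and the decision to return None when KVM is missing or inaccessible are assumptions of this sketch, and it stops well short of adding memory or virtual CPUs.

```python
import fcntl
import os

KVM_GET_API_VERSION = 0xAE00  # _IO(0xAE, 0x00) in <linux/kvm.h>
KVM_CREATE_VM = 0xAE01        # _IO(0xAE, 0x01) in <linux/kvm.h>


def probe_kvm(path="/dev/kvm"):
    """Return the KVM API version, or None if KVM cannot be used on this machine."""
    try:
        kvm_fd = os.open(path, os.O_RDWR)
    except OSError:
        return None  # no KVM module loaded, or no permission to use it
    try:
        version = fcntl.ioctl(kvm_fd, KVM_GET_API_VERSION, 0)
        # Creating a VM returns a new file descriptor that represents it.
        # Memory regions and vCPUs would be configured through that descriptor.
        vm_fd = fcntl.ioctl(kvm_fd, KVM_CREATE_VM, 0)
        os.close(vm_fd)
        return version
    except OSError:
        return None
    finally:
        os.close(kvm_fd)


kvm_api_version = probe_kvm()
print(kvm_api_version)
```

The empty VM created here has no devices at all, which is the gap that QEMU's emulation fills.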

LXF: I’ve been using Virt-manager for all my virtual machines for a while, and that leverages the libvirt API to manage VMs. Can you explain where this fits into the scheme of things?

AG: So QEMU in its most basic form is a userspace program; it’s a desktop program that you can launch on your own computer. Libvirt is a library that enables you to control virtual machines. You can create them, start them, stop them and destroy them. It’s an API that works on top of QEMU, but it also works on top of all kinds of other virtualisation options. So you can have something above it that handles VMs, and libvirt provides an abstraction layer between the VM provider itself, and the application that manages everything.
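For a flavour of what a management application sees through that abstraction layer, here is a small sketch using the libvirt Python bindings. It assumes the libvirt-python package is installed and that a libvirt daemon is reachable at the qemu:///system URI; the function name and the choice to return None when either is missing are ours.

```python
def list_domains(uri="qemu:///system"):
    """Return [(name, is_running), ...] for each VM, or None if libvirt is unreachable."""
    try:
        import libvirt  # provided by the libvirt-python package
    except ImportError:
        return None
    try:
        conn = libvirt.openReadOnly(uri)
        try:
            return [(dom.name(), bool(dom.isActive()))
                    for dom in conn.listAllDomains()]
        finally:
            conn.close()
    except libvirt.libvirtError:
        return None  # no daemon at this URI, or access denied


domains = list_domains()
print(domains)
```

Changing the URI is all it takes to point the same code at a different hypervisor driver, which is the abstraction Alberto describes.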

LXF: Our readers will be familiar with the idea of snapshotting VMs, but tell us about the other advantages of QCOW2.

AG: Compression is pretty handy. QCOW2 images are divided into clusters (similar to blocks in a filesystem) and you can compress those clusters. The resulting image is much smaller than what you would get with a raw file. There’s an overhead involved because whenever you read a cluster you have to decompress it, but it’s a trade-off. It also supports encryption; this year it got LUKS-compatible encryption. It also supports backing files, which are closely related to snapshots.
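The per-cluster trade-off can be sketched in a few lines of Python. This is an illustration only: zlib stands in for whatever the image uses, the 64KB cluster size is an assumption, and the compressed clusters live in a plain list rather than in an image file.

```python
import zlib

CLUSTER_SIZE = 64 * 1024  # assumed cluster size for this sketch


def compress_clusters(raw):
    """Split a raw image into clusters and compress each one independently."""
    return [zlib.compress(raw[i:i + CLUSTER_SIZE])
            for i in range(0, len(raw), CLUSTER_SIZE)]


def read_cluster(compressed, index):
    """Reading costs a decompression, but only of the one cluster requested."""
    return zlib.decompress(compressed[index])


# Four clusters of highly repetitive data, as on a mostly empty disk.
raw_image = (b"mostly empty disk " * 4096)[:CLUSTER_SIZE] * 4
compressed = compress_clusters(raw_image)

print(len(raw_image), sum(len(c) for c in compressed))
print(read_cluster(compressed, 2) == raw_image[2 * CLUSTER_SIZE:3 * CLUSTER_SIZE])  # True
```

Compressing each cluster separately keeps random access cheap: a read never has to decompress more than the cluster it touches.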

So you have a base file with a base image and an active file where data is written. When you need to read data that hasn’t been written to the active file, you go to the base file. In this way you can have several different VMs using the same guest OS and share common data between them.
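The backing-file arrangement can be modelled as a chain of layers. In this Python sketch (a toy with hypothetical names, not QEMU code) writes always land in the active layer, and a read is served by the first layer in the chain that holds the cluster.

```python
class Layer:
    """One image in a backing chain; clusters maps cluster index -> data."""

    def __init__(self, backing=None):
        self.clusters = {}
        self.backing = backing  # the image underneath this one, if any

    def write(self, index, data):
        # Writes only ever touch this layer; the backing image stays unchanged.
        self.clusters[index] = data

    def read(self, index):
        layer = self
        while layer is not None:
            if index in layer.clusters:
                return layer.clusters[index]
            layer = layer.backing  # not here, so fall through to the image below
        return None  # not allocated anywhere in the chain


base = Layer()
base.write(0, "guest OS")

vm_a = Layer(backing=base)  # two VMs sharing one base image
vm_b = Layer(backing=base)
vm_a.write(1, "A's data")
vm_b.write(0, "B's patched OS")

print(vm_a.read(0))  # guest OS
print(vm_b.read(0))  # B's patched OS
print(base.read(0))  # guest OS
```

Because neither VM ever writes to the base layer, both can share it safely while keeping their own changes private.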

LXF: What about resilience benefits? If I pull the plug while my QCOW2-backed VM is running, is my data still going to be there when I fire it up again?

AG: Well, with the QCOW2 format we make sure that all the writes to disk are done in such a way that if the VM crashes, the file isn’t corrupt. But if it is corrupted, then at least we can detect it. So when we develop the format, we have to take into account the ordering of the writes and try to make sure data isn’t lost. If data is lost then we have to make sure we can detect that.
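Why the ordering of writes matters can be shown with a toy model. The Python sketch below is an assumption-laden illustration, not the QCOW2 driver: it writes the data first and updates the translation table second, so a crash between the two steps leaves an unreferenced cluster that a consistency check can find, rather than a table entry pointing at garbage.

```python
class CrashBeforeCommit(Exception):
    """Simulates the VM or host dying between the two steps of a write."""


class Image:
    def __init__(self):
        self.clusters = []  # host clusters holding data
        self.table = {}     # guest index -> host index (the metadata)

    def write(self, guest_index, data, crash_before_commit=False):
        # Step 1: put the data in a fresh host cluster.
        self.clusters.append(data)
        host_index = len(self.clusters) - 1
        if crash_before_commit:
            raise CrashBeforeCommit
        # Step 2: only now make the table point at the new cluster.
        self.table[guest_index] = host_index

    def read(self, guest_index):
        host_index = self.table.get(guest_index)
        return None if host_index is None else self.clusters[host_index]

    def leaked_clusters(self):
        """A simple consistency check: clusters that nothing refers to."""
        return set(range(len(self.clusters))) - set(self.table.values())


image = Image()
image.write(0, "v1")
try:
    image.write(0, "v2", crash_before_commit=True)
except CrashBeforeCommit:
    pass

print(image.read(0))            # v1
print(image.leaked_clusters())  # {1}
```

The interrupted write is lost, but the previous contents are intact and the leftover cluster is detectable, which mirrors the two guarantees described above.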

LXF: QEMU has come a long way. I remember using it five years ago and getting scared and running back to the comfort of VirtualBox, but now I use it all the time. What are some new features that we can look forward to?

AG: One of the most important data structures in the QCOW2 format is the one that translates guest addresses to host addresses. In other words, to access the disk you first need to read the tables that contain those translations, and, to be performant, that requires a cache. Otherwise you’d have to read those tables twice for every disk access.

Currently, that cache works, but I’m working on some code to make it slightly more efficient. It enables you to have more fine-grained cache entries, so that with the same amount of cache memory you can achieve a better level of performance.
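The effect of finer-grained cache entries can be illustrated with a small simulation. This Python sketch is a simplified model under our own assumptions, not QEMU's cache: the translation table is a flat list, the cache is a plain LRU over slices of it, and the class and parameter names are hypothetical. Both caches below hold 2,048 table entries in total; only the entry size differs.

```python
from collections import OrderedDict


class TableCache:
    """LRU cache over fixed-size slices of a guest-to-host translation table."""

    def __init__(self, table, entries_per_slice, max_slices):
        self.table = table
        self.entries_per_slice = entries_per_slice
        self.max_slices = max_slices
        self.slices = OrderedDict()  # slice number -> cached slice of the table
        self.table_reads = 0         # how often we had to go back to the "disk"

    def lookup(self, guest_cluster):
        slice_no, within = divmod(guest_cluster, self.entries_per_slice)
        if slice_no in self.slices:
            self.slices.move_to_end(slice_no)  # mark as most recently used
        else:
            self.table_reads += 1
            start = slice_no * self.entries_per_slice
            self.slices[slice_no] = self.table[start:start + self.entries_per_slice]
            if len(self.slices) > self.max_slices:
                self.slices.popitem(last=False)  # evict the least recently used
        return self.slices[slice_no][within]


table = [guest + 1000 for guest in range(4096)]  # made-up host locations

coarse = TableCache(table, entries_per_slice=1024, max_slices=2)
fine = TableCache(table, entries_per_slice=128, max_slices=16)

# The guest keeps touching four widely separated areas of the disk.
for _ in range(100):
    for guest_cluster in (0, 1024, 2048, 3072):
        coarse.lookup(guest_cluster)
        fine.lookup(guest_cluster)

print(coarse.table_reads)  # 400
print(fine.table_reads)    # 4
```

With large entries, the four hot areas cannot all stay cached, so every lookup goes back to the table. With small entries the same memory covers all four.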

LXF: I’ve heard something about a QEMU advent calendar last year. What was the story there?

AG: A few people in the QEMU community created a website that offered a different disk image every day for the month of December 2016. There were some classics in there. I contributed a Spectrum image with seven games developed in this very decade. There was also a tiny chess game, BootChess, so-called because it’s small enough to fit into a 512-byte boot sector. There was also some more useful stuff, such as the WireGuard VPN and the Rust-based Redox OS. You can, and should, check it out at www.qemu-advent-calendar.org/2016. There’s a 2014 edition, too!

LXF: You can find out more about Igalia and their work with WebKit and Wayland in next month’s interview with Alberto’s fellow Igal-ite Juan José Sánchez Penas.
