Linux Format

Next-gen Linux audio

The only good system is a sound system. Jonni Bidwell is here to tell you all about Linux’s offerings, so listen up!


First there was the Advanced Linux Sound Architecture, then PulseAudio, and now PipeWire rules over them all, says Jonni Bidwell.

Sound is a sensitive issue, and as humans we’re very sensitive to audio stimuli. Sprint races are started with a gun rather than a flash because we react much more quickly to sound (about 150ms) than to light (about 200ms). If we’re watching a film while our system is busy, the video and audio may become momentarily desynchronised.

In order to restore sync we could skip (or back up) either audio samples or video frames. Almost universally, media players opt for the latter, since viewers will notice a blip in the audio much more than a couple of dropped frames. We tend to take for granted being able to play high-quality audio without discerning any distortion, but keeping all those buffers healthy and everything ticking in time with the quartz crystals in the audio hardware is hard work.
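The bookkeeping behind this is simple enough to sketch. Below is a toy Python model of the decision a player makes for each video frame against the audio clock – the function name and the 40ms threshold are our own invention for illustration, not any particular player’s internals:

```python
# Toy model of re-syncing video to the audio clock (illustrative only).
# If video lags the audio clock beyond a threshold, drop frames to catch up;
# if it runs ahead, repeat (hold) the previous frame. Audio is left untouched,
# since listeners notice audio glitches far more readily than a dropped frame.
def sync_action(video_pts: float, audio_clock: float,
                threshold: float = 0.040) -> str:
    """Decide what to do with the next video frame (times in seconds)."""
    drift = video_pts - audio_clock
    if drift < -threshold:
        return "drop"      # video is late: skip this frame
    if drift > threshold:
        return "repeat"    # video is early: show the previous frame again
    return "show"          # close enough: display normally

print(sync_action(1.000, 1.100))  # video 100ms behind the audio clock
print(sync_action(1.100, 1.000))  # video 100ms ahead
print(sync_action(1.010, 1.000))  # within tolerance
```

Real players smooth the drift estimate over several frames rather than reacting to a single sample, but the drop-or-repeat choice is the same.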

Linux often gets a bad rap for multimedia support. Whether it’s MP3 playback not working out of the box, video tearing, or Blu-rays requiring voodoo number theory and a blessing from the god Ba’al before they play (see LXF223), there is no shortage of gripes. Most of the time, though, this isn’t Linux’s fault, or even the fault of the hard-working maintainers of kernel driver stacks or multimedia projects. Very often there are murky patents governing the use of particular technologies. Then there’s hardware that doesn’t adhere to standards – and let’s not forget that dragon, DRM. In fact, Linux has an impressive, state-of-the-art multimedia stack, capable of handling not just a 7.1 soundtrack while leisurely streaming 4K video, but also, thanks to JACK, 192kHz studio recording and music production.

Further, the nascent PipeWire project will modernise things yet more, bringing low-latency playback/recording, real-time multimedia processing and support for sandboxed applications. But even today, Linux distributions have some state-of-the-art multimedia capabilities. Join us on a journey through the multimedia systems that, for the most part, we no longer need to fight with…

“Linux has a state-of-the-art multimedia stack, capable of handling a 7.1 soundtrack”

The first audio subsystem for Linux (and other UNIX-like animals), the Open Sound System (OSS), provided basic support for playback and recording, and more than satisfied the audio needs of most ’90s bods (we were simpler creatures back then). There was also patchwork support for some devices provided directly by the manufacturers (some of them did care about Linux, even in the ’90s), but this was generally closed source.

OSS grew out of the drivers for the then-popular Sound Blaster 16 card, which had many clones. It also provided the low-level kernel drivers for audio hardware, as well as an API for applications. As with anything vaguely hardware-related in the early days, getting sound working required recompiling your kernel, and optionally tears or hair loss. Functionally, OSS provided the /dev/dsp* and /dev/mixer* devices, which generally could only be accessed by one process at a time. This meant that two applications couldn’t play sound simultaneously, unless the hardware was capable of mixing the streams natively and OSS was able to persuade it to do so. To solve this, the KDE and Gnome desktops developed their own sound systems, aRts and ESD respectively, which did the required mixing in software and despatched the resultant stream to OSS. This worked well, and made writing audio applications much easier – unless of course you still needed direct OSS support, or wanted to support both aRts and ESD.
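The “required mixing” itself is no great mystery: conceptually it’s just summing the sample streams and clamping to the sample format’s range, then handing the single resulting stream to the device. A minimal Python sketch (our own illustration, not ESD’s or aRts’ actual code) for signed 16-bit PCM:

```python
# Software mixing as the old desktop sound servers did it: add the streams
# sample by sample, then saturate to the signed 16-bit range so loud peaks
# clip at the rails rather than wrapping around into noise.
def mix16(a: list[int], b: list[int]) -> list[int]:
    """Mix two signed 16-bit PCM streams with saturation."""
    return [max(-32768, min(32767, x + y)) for x, y in zip(a, b)]

print(mix16([1000, -2000, 30000], [500, -500, 10000]))
# The final pair sums to 40000, which saturates to 32767.
```

Production mixers also resample streams to a common rate and apply per-stream gain before summing, but saturating addition is the heart of it.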

And so began the Jenga-like adding of layers to the audio stack. Simple DirectMedia Layer (SDL) is a wrapper around all of the above (as well as input drivers, DirectX/OpenGL and the Windows and Mac sound systems) that’s still around today. Its portability makes it especially popular for cross-platform games. But one wrapper is never enough, and so libao was born. Libao had some nice features and eventually found its way into the popular MPlayer project in 2001 in the form of libao2 – the -ao option lives on there as a means to choose which sound system to use.

ALSA in action

In 2002, OSS developer Hannu Savolainen, then contracted by 4Front Technologies to work on the stack, released OSSv4 under a proprietary licence (though it was re-released under the GPL five years later and is still developed today). This led to Linux adopting the Advanced Linux Sound Architecture (ALSA, which had been in development since 1998 – see LXF108) for the 2.6 kernel. Many people were happy with this arrangement, although there were criticisms of OSS on Linux besides its newfound proprietary licensing, including its shifting of a bunch of signal processing code into the kernel, and other gripes lost in the sands of time.

“Open Sound System grew out of the drivers for the Sound Blaster 16 card”

ALSA is a complicated beast, and from the outset aimed to be much more than OSS. Most notably, ALSA wanted to treat hardware uniformly with thread-safe kernel drivers, provide software mixing where necessary (to accommodate onboard audio codecs, such as the ubiquitous AC97, which offloaded mixing duties to the CPU) and improve MIDI support. But it also wanted to be compatible with OSSv3, so an emulation layer was required.

So ALSA consisted of a kernel component, which provided the hardware drivers, together with a userland library exposing the native and OSS userspace APIs, as well as the mixing component. ALSA’s own low-level API is pretty beastly, which had consequences we’ll discuss later. There were also plugins for up/downmixing, equalisation, resampling and interacting with all the other audio systems.

Setting up software mixing was a bit of a mission in the early days, and whether it worked depended on your hardware and the phase of the moon. You’d have to create a configuration file, ~/.asoundrc, call the dmix plugin to arms, and then spend a day listening to two applications fight it out. Often you’d give up and decide you didn’t really need to hear the ding of an AOL instant message if it was going to interrupt your 128k Metallica MP3.
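For the curious (or nostalgic), a minimal ~/.asoundrc of that era looked something like the following. Device names, the ipc_key and the buffer sizes all varied with hardware and taste, so treat this as a sketch rather than a drop-in config:

```
# ~/.asoundrc - route everything through the dmix software mixer
pcm.!default {
    type plug            # convert formats/rates as needed
    slave.pcm "dmixer"
}

pcm.dmixer {
    type dmix
    ipc_key 1024         # any unique integer, shared by all clients
    slave {
        pcm "hw:0,0"     # first card, first device - adjust to taste
        period_size 1024
        buffer_size 4096
        rate 44100
    }
}

ctl.dmixer {
    type hw
    card 0
}
```

With this in place, every application using the default PCM device fed the shared dmix buffer instead of grabbing the hardware exclusively – exactly the arrangement modern ALSA now sets up for you automatically.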

This functionality was actually disabled by default initially, so casual desktop users still relied on ESD and aRts. Those and other systems then had to add support for ALSA, or use its OSS emulation layer. So much for progress. However, things improved, bugs were found and fixed, and mixing support was enabled by default in major distributions. Professional musicians were able to use the JACK Audio Connection Kit (JACK, started in 2001) and real-time kernel patches to seamlessly route audio between applications. All of a sudden Linux became a serious platform for music production.

Not a perfect solution

But on the desktop, some gripes remained. ALSA’s dmix implementation was a little hacky. It didn’t really enable multiple streams to access the hardware at the same time; rather, it allowed whoever got there first to share their access. In most cases this amounted to almost the same thing, but things broke down when, for example, multiple users tried to play things simultaneously.

There was also only a single software volume control, so there was no per-application volume control. In some cases this could be worked around, but in others, especially playing network audio, the shortcomings became apparent. Windows Vista, for all its resource-sapping widgets and abundant other flaws, did feature a whole new audio stack. Apple, likewise, had its CoreAudio stack, which like so many Apple things was simply magical.

Enter PulseAudio. It is perhaps a measure of how popular Linux (or perhaps just Ubuntu Linux) had become that PulseAudio drew so much criticism when it went mainstream. It started life as Polypaudio in 2004, and four years later found its way into Ubuntu 8.04. Intentions were honourable, but unfortunately things did not go so smoothly – see the original, ambitiously titled mission statement at http://bit.ly/audio-mess.

PulseAudio is a sound server that sits on top of ALSA. It doesn’t touch the kernel at all, and aimed to replace middle layers such as aRts and ESD while maintaining compatibility with them. It provided exciting new features such as network audio, timer-based (as opposed to interrupt-based) scheduling, on-the-fly switching of inputs and outputs, as well as per-stream volume controls. This brought parity with the recently released Windows 7.

The problem was, most Ubuntu users didn’t need (or didn’t think they needed) these features, and the previous ALSA/ESD arrangement worked perfectly well for them. Suddenly PulseAudio was on their systems, and audio was stuttering, distorting, or going out of sync during video playback, while certain applications no longer worked at all. Forums became awash with disgruntled users wanting to banish this audio heathen – and with users who had broken their systems by uninstalling PulseAudio without due care and attention, taking with it all the applications that depended on it, of which there were many. Complaints about PulseAudio not working began to outnumber those concerning wireless.

There were certainly some bugs in the version of PulseAudio shipped with 8.04, and more bugs still in the Ubuntu implementation, but it’s not fair to plant all the blame there. More interestingly, the mass adoption uncovered bugs in that sprawling ALSA API we mentioned earlier. So complex was this beast that much of its functionality remained unused (and largely undocumented) until PulseAudio came along, and suddenly all manner of hardware and applications were hitting these hitherto unused features. These bugs have all since been fixed, and anyone who’s ever used HDMI audio should be grateful: without PulseAudio’s advanced routing functionality this would be much harder. The latest PulseAudio release, 11, adds support for various AirPlay and Bluetooth devices, and even includes support for GNU/Hurd.

It’s still possible to run a pure ALSA configuration. This provides lower latencies and would be suitable on a constrained device where only a single audio application is installed – for example, an old Raspberry Pi running MPD (although if you care about sound quality you really should add a DAC to it; sound through the Pi’s headphone jack is weak). PulseAudio is included in all major desktop distributions and by now is definitely at the ‘just works’ stage for most desktop hardware. The only reason to avoid it is for professional audio work, where sub-10ms latencies are required. There’s really only one solution here (okay, technically two), so check the box (below left) if you don’t know JACK.
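Those latency figures fall straight out of the buffering arithmetic: the delay a sound server adds is essentially the number of frames it keeps buffered, divided by the sample rate. A quick Python illustration – the configurations shown are typical examples rather than anything prescriptive:

```python
# Buffering delay for an audio setup: latency = buffered frames / sample rate.
# JACK-style setups quote this as frames-per-period times number of periods.
def output_latency_ms(frames_per_period: int, periods: int,
                      sample_rate: int) -> float:
    """Worst-case buffering delay in milliseconds."""
    return frames_per_period * periods / sample_rate * 1000

# A typical low-latency JACK configuration: two 128-frame periods at 48kHz.
print(round(output_latency_ms(128, 2, 48000), 2))   # comfortably under 10ms
# More conservative desktop-style buffering: four 1024-frame periods at 44.1kHz.
print(round(output_latency_ms(1024, 4, 44100), 2))  # nearly a tenth of a second
```

Smaller buffers mean the hardware interrupts the CPU more often and any scheduling hiccup causes an audible dropout, which is why low-latency work pairs small periods with real-time kernel scheduling.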

Totally wired

The latest development in the Linux multimedia sphere is GStreamer man Wim Taymans’ PipeWire. This ambitious project was originally titled PulseVideo, and yes, it aims to do for video what PulseAudio does for audio. We don’t mean that PipeWire makes you want to uninstall it by any means necessary and revert your video-playing subsystems back to how they were, all the while cursing your distro for adopting this “rubbish”. Nay, PipeWire hopes to reduce fragmentation and simplify existing frameworks for media playback. Codecs can be managed from a single place, so there will be no more situations where a file will play in one application but not in another.

PipeWire also wants to ease the transition toward Wayland and containerised applications (those packaged as Flatpaks, Snaps and AppImages). Anyone who’s dabbled with Wayland will be aware that it currently struggles with things like screenshots and screen recording (a shortcoming that also affects remote desktop software).

These don’t work because Wayland isolates applications in the same way as Flatpaks and the rest do, preventing them from seeing what other applications are up to (including what the compositor is putting on the screen). By adding support for PipeWire in the compositor, this could be solved in a secure manner. Likewise, adding support in the SPICE protocol will benefit multimedia applications running in VMs.

You can read more at Christian Schaller’s blogpost at http://bit.ly/launching-pipewire. You can experiment with PipeWire in the recently released Fedora 27. Incidentally, this release allows hitherto verboten audio formats such as MP3, AAC and AC3 to be played without the use of third-party repositories. In a world of streaming media and awesome open formats like FLAC and Ogg Vorbis, this is perhaps less relevant than it once was, but better late than never. LXF

“PulseAudio is definitely at the ‘just works’ stage for most desktop hardware”

There’s a school of audio purism that says equalisation should only be applied by expensive hardware, but you can do it with PulseAudio, too.

Besides all the balkanised Linux audio systems that preceded it, PulseAudio also provided special dispensation for Flash, which was all-too-common in the 2000s.

PipeWire sounds sweet to us!

Making drum sounds is serious business, and Hydrogen is a serious drum machine.
