Kernel Watch
Jon Masters summarises the latest happenings in the Linux kernel, so you don’t have to.
Linus Torvalds has announced the release of Linux 5.8. He said throughout the 5.8 release candidates that 5.8 is a big release, yet the key standout is that there isn’t a key standout. While in the past, big releases have been marked by specific contributions – such as significant register descriptions for AMD GPUs in kernel 4.12 – 5.8 is “really all over the place”, with over 14,000 code commits and counting, excluding merge (summary) commits.
The first two weeks of a kernel development cycle are referred to as the “merge window”, when disruptive changes are allowed and new features land. The latter weeks of the cycle are for stabilisation and bug fixing. New features landing in 5.8 include a tweak to how the kernel handles swappiness, security fixes and many other enhancements. LWN has a good writeup at www.lwn.net.
Andy Lutomirski noted that support for FSGSBASE is queued up for Linux 5.9 and could do with some additional testing from the curious. The tests are in tools/testing/selftests/x86/fsgsbase_64.
Last Branch Records
Intel CPUs have long implemented support for tracking taken branches in running applications. A new set of patches aims to add support for the new Architectural LBR, a standardised mechanism that will work across many CPU generations without model-specific enablement being required. The aim of the code is to enhance the Linux perf command to track branch records generically, the results being useful for a variety of optimisation tricks, such as those provided by AutoFDO.
Modern programs are formed from millions of small sequences of CPU instructions interspersed every five instructions or so with a branch (jump) to some other piece of code. These branches happen in response to a conditional such as an if. For profiling purposes, it can be useful to understand which branches are taken most frequently, so that application code can be optimised accordingly.
Traditionally, profiling of an application would require instrumentation of the binary itself, and a user or developer would have to run this special version of the binary to collect profiling data that could be fed back into the compiler for optimising a final production build. Relatively few developers fully utilise this process due to the steps involved. There is another way, however, and that involves profiling an application with hardware help.
Last Branch Records are that help on x86. The CPU collects the addresses of a fixed number of the most recently taken branches and stores them in a buffer that can be read by the Linux perf tool. Perf can then generate the same kind of data as the instrumented binary would, but without the special build, and feed this into a compiler performing automated optimisation of an application at build time.
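To make the idea concrete, here is a toy model in Python of a fixed-depth branch record buffer being sampled periodically. It is only a sketch and not how the hardware or perf is implemented: the class name, the buffer depth, the sampling interval and the addresses are all hypothetical, chosen purely for illustration.

```python
from collections import Counter, deque


class LastBranchBuffer:
    """Toy model of a fixed-depth branch record buffer (not real hardware)."""

    def __init__(self, depth=32):
        # depth is an illustrative assumption; a deque with maxlen
        # silently drops the oldest record once the buffer is full.
        self.records = deque(maxlen=depth)

    def record(self, from_addr, to_addr):
        """Log one taken branch as a (source, target) address pair."""
        self.records.append((from_addr, to_addr))

    def snapshot(self):
        """Return the current buffer contents, oldest first."""
        return list(self.records)


def hottest_branches(samples, top=3):
    """Count how often each branch appears across sampled snapshots."""
    counts = Counter()
    for snap in samples:
        counts.update(snap)
    return counts.most_common(top)


def demo():
    lbr = LastBranchBuffer(depth=4)
    samples = []
    # Hypothetical trace: one hot loop branch and one rarer branch.
    hot = (0x400100, 0x400200)
    rare = (0x400300, 0x400400)
    trace = [hot] * 5 + [rare] * 2 + [hot] * 3
    for i, (src, dst) in enumerate(trace, start=1):
        lbr.record(src, dst)
        if i % 5 == 0:  # the profiler samples the buffer periodically
            samples.append(lbr.snapshot())
    return hottest_branches(samples, top=2)


if __name__ == "__main__":
    for (src, dst), hits in demo():
        print(f"{src:#x} -> {dst:#x}: seen {hits} times")
```

The point of the sketch is that the profiler never touches the program being measured: it only reads a small rolling window of recent branches now and then, and the frequently taken branches still dominate the totals.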
The patches include improved context switching via Intel’s XSAVES CPU instruction, among many other improvements.