Sensors and sensibility
Recalling a past life where he gazed on dials and Galilean thermometers, Jonni Bidwell assesses Linux’s system monitoring offerings.
We don’t think that governments or tech giants have good reason to be monitoring every aspect of our lives. But we do think it’s well within your remit to monitor all the minutiae of your Linux machine.
CPU load, fan speeds, temperatures, network traffic and disk space are just some of the things that you can keep an eye on. And when it’s presented in delightful graph form it’s a pleasure to do so.
There’s good reason to keep an eye on this data. If you notice your machine slowing down it could be because it’s run out of memory, or because a task is hogging the CPU. These are easy to check for from the command line or from a graphical system monitor. But the culprit could be overheating, which you could check for by prodding the heatsinks on your CPU and graphics card, though we don’t recommend that. Instead, we’ll show you how to read thermal information from the command line, capture it while you’re playing your favourite GPU-intensive shooting game, and then graph it with Gnuplot.
Modern graphics cards have all kinds of sensors on them and Windows users have long enjoyed friendly GUI applications to monitor and meddle with voltages and frequencies. The open source AMDGPU driver exposes all this data through the kernel, and Nvidia’s System Management Interface (SMI) enables users of such hardware to do so from the command line. But for AMD users there’s the superb Corectrl, which can monitor everything in real time. Not only that, you can use it to fine-tune your GPU settings for maximum performance or optimal power efficiency.
There are a few command line tools everyone should know about, and one of them is Top. It comes free with any Linux distro you’d care to name. Open up a terminal and type top to see a list of running processes ordered by CPU usage, updated every couple of seconds.
The first line shows system uptime and load information (the last three numbers are how busy your CPU has been over the last one, five and 15 minutes). Below the task count, the %Cpu(s) line shows how CPU time has been split between user and system tasks and how long it has been idle, followed by how long it has been waiting on I/O operations and servicing hardware interrupts. The MiB Mem row shows memory usage, with swap space on the row beneath. Free memory is essentially wasted memory, so the kernel will use as much as it can for buffers and cache, which you’ll see in the buff/cache column. This isn’t really hogging memory, since it can be readily relinquished. So if it makes you feel better you can add this figure to the free memory displayed.
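Top gets these summary figures from the kernel’s /proc filesystem, and you can read them yourself. The following is only a rough sketch: the function names and the optional file argument are our own inventions, and Top’s buff/cache figure also counts some reclaimable kernel memory, so expect the totals to differ slightly.

```shell
#!/bin/sh
# Sketch: the numbers behind Top's summary, read from /proc.
# The optional file argument is our addition, handy for trying saved data.

load_summary() {
    # /proc/loadavg begins with the 1, 5 and 15 minute load averages
    read -r one five fifteen rest < "${1:-/proc/loadavg}"
    echo "load: 1min=$one 5min=$five 15min=$fifteen"
}

mem_summary() {
    # /proc/meminfo reports kB; divide by 1024 to get MiB
    awk '
        $1 == "MemFree:" { free = $2 }
        $1 == "Buffers:" { buffers = $2 }
        $1 == "Cached:"  { cached = $2 }
        END {
            printf "free=%d MiB, free+buff/cache=%d MiB\n",
                   free / 1024, (free + buffers + cached) / 1024
        }
    ' "${1:-/proc/meminfo}"
}
```

Call load_summary and mem_summary with no arguments to read the live values.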
Get on top of things
The VIRT, RES and SHR columns in the process list show the total amount of memory used, resident size and shared memory (in kilobytes) of each program. The VIRT column includes the aforementioned buffers and caches, and it’s not really fair to say Gnome Shell is a memory hog just because its value here is in the gigabytes. If a program is mercilessly taxing your CPU, it will remain at the top of the list. If that program isn’t something you want running – for example, an unresponsive application or a crashed browser process – then we can use one of Top’s lesser-known features to kill it. Just press k and enter a process ID (or press Enter to select the most active one), and press Enter again to send it the TERM signal (15), which will ask it nicely but firmly to close. If it doesn’t do so, do the same thing again, but send the KILL signal (9).
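The same ask-nicely-then-insist routine can be scripted for when Top isn’t to hand. Treat this as a minimal sketch: polite_kill is a made-up name and the five-second grace period is an arbitrary choice of ours.

```shell
#!/bin/sh
# Sketch: send TERM, wait a while, then fall back to KILL.
# Usage: polite_kill PID [GRACE_SECONDS]
polite_kill() {
    pid=$1
    grace=${2:-5}                                   # arbitrary default
    kill -s TERM "$pid" 2>/dev/null || return 1     # no such process, or not ours
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0      # signal 0 only tests for existence
        sleep 1
        grace=$((grace - 1))
    done
    kill -s KILL "$pid" 2>/dev/null                 # it ignored TERM, so insist
    return 0
}
```

For example, polite_kill 1234 10 gives process 1234 ten seconds to tidy up before it is forcibly stopped.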
To list processes by memory usage, press M. Top is case sensitive, and if you press m you’ll toggle the memory summary at the top. It can display a rather fetching bar chart, and if you like the sound of that you should check out Htop, which you can install with:
$ sudo apt install htop
Htop is a modern re-imagining of Top, with clickable and scrollable output. There’s a range of Top alternatives (including the Rust programs Ytop and Bottom, the Node.js-based Gtop, and Bpytop, which we’ll look at soon). If you just want a graphical tool to keep an eye on your desktop machine’s resource usage, then try Gnome’s System Monitor or KDE’s Ksysguard. As well as process info, they both show network traffic, and Gnome’s offering even shows disk read and write statistics. There are all kinds of plugins, applets and other desktop add-ons for displaying sensor data. But they all rely on the LM (Linux Monitoring) Sensors suite.
LM Sensors enables you to read temperatures, fan speeds, voltages and the like from the kernel, and Psensor is a GUI utility for graphing them. You can install both with a simple:
$ sudo apt install lm-sensors psensor
then read sensor data with an even more simple:
$ sensors
You should see at least some readings from at least some adapters (PCI devices or ACPI interfaces with sensors). But you can perhaps get more if you run:
$ sudo sensors-detect
which will probe your system for additional adapters. It’ll warn you about potentially unsafe probes and enable you to skip them.
Special care should be taken with laptops here (if it crashes then make sure to say No next time). Typically, this will find extra sensors on the motherboard’s I2C or ISA buses (the it87 chip is popular). Readers of a certain vintage will remember ISA cards, and may be pleased to hear that the bus lives on, even if the slots do not.
Once the probes are complete the program will tell you which modules should be loaded to read what sensors, and even offer to have your system autoload them. Most Linux distributions will load things like coretemp and k*temp (for Intel and AMD CPU temperatures) automatically, and any sensor-bearing PCI devices you have (as long as they’re working) already pipe this to the kernel. So the list of drivers produced by the previous command might be short (for example, on a modern laptop) or might contain modules that were already loaded. To check, open another terminal and run sensors again and see if more sensor data is available. If so, take it up on its offer to update /etc/modules.
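If you’d like to know which sensor drivers are already loaded before accepting that offer, the kernel’s module list will tell you. This is just a sketch: the function name is hypothetical and the handful of modules it looks for is our own selection rather than an exhaustive list. Drivers compiled into the kernel itself won’t appear here.

```shell
#!/bin/sh
# Sketch: print whichever of a few common sensor modules are loaded.
# Reads /proc/modules unless another file is given (our addition).
loaded_sensor_modules() {
    awk '$1 ~ /^(coretemp|k8temp|k10temp|it87|amdgpu)$/ { print $1 }' \
        "${1:-/proc/modules}"
}
```

Run loaded_sensor_modules before and after sensors-detect to see whether anything new has appeared.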
We can use another useful Linux command to constantly refresh the screen with the output from LM Sensors. Try running:
$ watch sensors
for live updates. Alternatively, launch Psensor to draw CPU temperature, fan speed and voltage graphs. You’ll want to go to the Sensor Preferences and come up with friendlier names than temp1 etc.
You’ll also discover that there are plenty of readings you don’t care about (apparently our laptop has six different thermal sensors) so uncheck them to unclutter your graph. Psensor can also monitor hard drive temperature via Hddtemp and Libatasmart, but this only works for rotational drives, not solid state ones.
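The same pruning can be done at the command line. LM Sensors has a raw output mode, sensors -u, that is easier for scripts to digest, and a little Awk will boil it down to just the temperatures. This is a sketch that assumes the usual layout of that output (an unindented chip name, an unindented label ending in a colon, then indented readings); list_temps is our own name for it.

```shell
#!/bin/sh
# Sketch: reduce `sensors -u` output (on stdin) to "chip,label,degrees" lines.
list_temps() {
    awk '
        /^[^ :]+$/             { chip = $0; next }                      # e.g. coretemp-isa-0000
        /^[^ ].*:$/            { label = $0; sub(/:$/, "", label); next } # e.g. "Package id 0:"
        /^ +temp[0-9]+_input:/ { printf "%s,%s,%.1f\n", chip, label, $2 }
    '
}
```

Try it with sensors -u | list_temps, then pipe the result through grep to keep only the chips you care about.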
Colourful information
Our new favourite way of graphing sensor data is Bpytop. It’s a 256-colour terminal affair, with graphs, charts, gauges and the kitchen sink, and you can install it via Pip and run it with the following:
$ sudo apt install python3-pip
$ sudo pip install bpytop --upgrade
$ bpytop
The utility is incredibly configurable: use the number keys to toggle the visible panes, scroll through the process list with the cursor keys, and hit Enter to see more information about the highlighted process (or to kill it). We recommend checking Bpytop’s project page for more details on this.
If you have an AMD graphics card (and it’s new enough to use the AMDGPU driver), then you probably saw some temperature and fan speed readings in the sensors command. If you’re using the proprietary Nvidia driver, you can use the nvidia-smi program to obtain similar data. Psensor can chart this information too.
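For scripting purposes, nvidia-smi can be asked for a single bare number instead of its usual table. Here’s a small sketch, assuming the proprietary driver is installed; the gpu_temp wrapper is our own naming, and it only reports the first GPU.

```shell
#!/bin/sh
# Sketch: print the first Nvidia GPU's temperature in degrees C.
gpu_temp() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # csv,noheader,nounits strips everything but the value
        nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits | head -n 1
    else
        echo "nvidia-smi not found" >&2
        return 1
    fi
}
```

Other fields, such as fan.speed and power.draw, can be added to the --query-gpu list, separated by commas.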
However, we wanted to showcase another outstanding utility: Corectrl. This is an AMD-only affair, but can monitor both CPUs and GPUs, and furthermore can do so on a per-application basis. The AMDGPU driver also supports overclocking, but you’ll need to add a kernel argument to unlock it. The syntax for setting power profiles is pretty tricky too, and not the sort of thing you want to get wrong when your precious hardware is at stake. But Corectrl makes it easy, at least once you’ve added the magic boot parameter:
amdgpu.ppfeaturemask=0xffffffff
which we’re deliberately not going to tell you how to use because we don’t want to be responsible for you cooking your GPU. Time to do your own research!
Opinions differ on overclocking. When Linus Torvalds upgraded to a mighty 32-core Threadripper 3970X in May 2020 he said (in a ZDNET interview at http://bit.ly/lxf274linus) that when choosing a new PC he “looks for something that’s good for overclocking” but then categorically does not overclock it. Indeed, some people like to underclock things. Cryptocurrency miners will often dial back the grunt to get a better dollars:watts ratio, and the AMDGPU driver (and Corectrl) supports power limiting too.
If you have problems with games, whether it’s crashing, stuttering or just being rubbish at them, it’s worth keeping an eye on GPU temperatures while you’re playing. Except that’s easier said than done. Linux gaming may have come a long way, but many titles don’t like being Alt-tabbed out of (or won’t let you Alt-tab back in) so that you can check your desktop sensor program. But we can solve this with the magic of scripts. The following bit of rudimentary Bash will extract the GPU temperature every five seconds and put it, along with the current UNIX time, in a CSV file. You’ll need to tweak the grep and cut arguments to match your system.
#!/bin/bash
# Append "unixtime,temperature" to gputemp.csv every five seconds
while :
do
    temp=$(sensors -u | grep -m1 temp1_input | cut -c16-23)
    time=$(date +%s)
    echo "$time,$temp" >> gputemp.csv
    sleep 5
done
Save it as gpumonitor.sh, make it executable and run it before launching whatever game is causing you issues or suspicion. Even if the game crashes, the CSV file will remain intact, and you can inspect the last few entries. But we mentioned (with worrying multiplicity – Ed) graphs, and Gnuplot (which is just an Apt install away) will happily graph the data from this file. It will translate the timestamps (seconds since 1 Jan 1970) into human-readable dates too – we just need to tell it how. And you can do that by making a file called gputemp.plot and filling it with:
set datafile separator ','
set title "GPU Temp"
set yrange [20:100]
set xdata time
set timefmt "%s"
plot "gputemp.csv" using 1:2 with lines axes x1y1
Then, making sure you stopped the monitoring script with Ctrl-c, run gnuplot gputemp.plot and you should see a chart (if the window vanishes straight away, add the -p option to keep it open). Of course, there are all kinds of options to make it a better chart, but this gets the job done minimally, which is good as we’re fresh out of space. Please do show us your beautiful graphs though!
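If a graph is more than you need, a few lines of Awk will pull the headline numbers out of the same CSV file. As before this is only a sketch, and summarise_temps is a name we’ve made up; it assumes the two-column timestamp,temperature layout produced by the monitoring script.

```shell
#!/bin/sh
# Sketch: report sample count, min, max and mean from a "time,temp" CSV.
# Reads gputemp.csv unless another file is given.
summarise_temps() {
    awk -F, '
        NF >= 2 && $2 != "" {          # skip lines where no reading was captured
            t = $2 + 0
            if (n == 0 || t < min) min = t
            if (n == 0 || t > max) max = t
            sum += t
            n++
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "samples=%d min=%.1f max=%.1f mean=%.1f\n", n, min, max, sum / n
        }
    ' "${1:-gputemp.csv}"
}
```

A maximum close to your card’s thermal limit is a strong hint that heat is behind those crashes.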