Psutil for sysadmins
Mihalis Tsoukalos teaches you how to use the psutil module to examine your Linux system and access information on processes and usage.
Mihalis Tsoukalos explains how to use the cross-platform psutil module to examine your Linux system and become a power user and happy system administrator.
The main advantage of psutil, the Python module that provides information on system processes and system usage, is that it’s cross-platform, which means that most code written with it will run without changes on the operating systems it supports. After you read this tutorial and see the practical examples, you’ll know how to use the psutil module in your own projects, especially if you are a Linux system administrator or a developer of a complex application. To install psutil for Python 3 you’ll need to perform one of the following two actions:

$ sudo apt-get install python3-psutil
$ pip3 install psutil
The following interaction with the Python 3 shell will verify that psutil is properly installed:

>>> import psutil
>>> print(psutil.__version__)
5.0.0
This means we’ll be using version 5.0.0 in this tutorial.
The basics of psutil
The simplest task you can perform with psutil is finding out the number of processes on your Linux machine:

#!/usr/bin/env python3
import psutil

nPlist = 0
for proc in psutil.process_iter():
    nPlist = nPlist + 1
print(nPlist)
print("Output format:", proc)
The program uses the psutil.process_iter() method to iterate over the processes of your Linux system and count them. If you decide to execute simplePS.py, you will get the following output:

$ ./simplePS.py
145
Output format: psutil.Process(pid=27841, name='apache2')
The first line of the output (above) is the number of processes, whereas the second line shows that the psutil.process_iter() iterator returns objects that belong to the Process class as defined in the psutil package. Please note that if you execute simplePS.py with root privileges you may get a higher number of processes. Also, remember that the Python 3 code of simplePS.py is portable and therefore can be executed on other Unix machines without any changes!
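If you only need the count, you can get it without writing the loop yourself. The following is a minimal sketch rather than part of simplePS.py: the function names count_processes() and count_by_iteration() are made up for illustration, while psutil.pids() and psutil.process_iter() are real psutil calls.

```python
#!/usr/bin/env python3
import psutil


def count_processes():
    """Return the number of processes that exist right now."""
    # psutil.pids() returns a list with the PIDs of the running processes.
    return len(psutil.pids())


def count_by_iteration():
    """Count processes the way simplePS.py does, in a single expression."""
    return sum(1 for _ in psutil.process_iter())


if __name__ == "__main__":
    print("Via psutil.pids():", count_processes())
    print("Via process_iter():", count_by_iteration())
```

The two numbers can differ slightly, because processes may start or exit between the two calls.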
Uptime
This section will use the psutil module to emulate the job of the uptime command line utility. The important Python 3 code in myUptime.py is the following:

bootTIME = psutil.boot_time()
NOW = time.time()
s = NOW - bootTIME
So you should use the psutil.boot_time() method to get the time your Linux system booted and time.time() to get the current time in order to find the difference between them in seconds. Then you’ll have to convert that amount of time into days, hours, minutes and seconds. Executing myUptime.py generates the following output:

$ ./myUptime.py
Uptime: 93 days, 3 hours, 39 minutes and 38 seconds!
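The conversion step is not shown in the excerpt above, so here is one way it could be done. This is a sketch under assumptions, not the code of myUptime.py: the function names split_uptime() and format_uptime() are hypothetical, and the sample duration is hard-coded so that the block runs on its own. In a real script the value would be time.time() - psutil.boot_time().

```python
#!/usr/bin/env python3
def split_uptime(seconds):
    """Break a duration in seconds into (days, hours, minutes, seconds)."""
    seconds = int(seconds)  # drop the fractional part of the difference
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return days, hours, minutes, secs


def format_uptime(seconds):
    """Format a duration in the style of the myUptime.py output."""
    d, h, m, s = split_uptime(seconds)
    return "Uptime: {} days, {} hours, {} minutes and {} seconds!".format(d, h, m, s)


if __name__ == "__main__":
    # Sample duration only; replace it with time.time() - psutil.boot_time().
    print(format_uptime(8048378))
```

Each divmod() call peels off one unit and passes the remainder on, which avoids a chain of manual divisions and modulo operations.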
As you’ll probably appreciate, the use of the psutil module is mainly a process of using the right methods to extract the desired information.
Process ID intel
A core psutil task is getting information about a given process when you know its process ID. The good thing about psutil is that it gives you the desired information easily, without having to deal with too many function calls, provided that you know what you are looking for! Performing the same task using Linux command line utilities is more difficult. The Python 3 code of processID.py shows this:

#!/usr/bin/env python3
import os
import sys
import psutil

if len(sys.argv) == 2:
    PID = int(sys.argv[1])
else:
    print('Usage:', sys.argv[0], 'processID')
    sys.exit(0)

if psutil.pid_exists(PID):
    print("Checking process with PID ", PID)
else:
    print("Process does not exist!: ", PID)
    sys.exit(0)

p = psutil.Process(PID)
print("Parent process ID:", p.parent())
print(p.status())
print(p.uids())
print(p.gids())
print(p.children())

It’s crucial to read the input as an integer, because psutil.Process() expects an integer and not a string. Additionally, it’s important to make sure that the process ID you want to process actually exists, which is verified with the help of the psutil.pid_exists() method. Once again you can see the simplicity of psutil. However, the cost of this simplicity is having to make multiple method calls. Executing processID.py will generate the following kind of output:

$ ./processID.py 7180
Checking process with PID 7180
Parent process ID: psutil.Process(pid=7179, name='sshd')
sleeping
puids(real=1000, effective=1000, saved=1000)
pgids(real=1000, effective=1000, saved=1000)
[<psutil.Process(pid=11731, name='python3') at 139829741027456>]
The second line displays the parent process, including its process ID, whereas the third line shows the status of the process. The fourth line shows the various user IDs of the process and the fifth line shows the various group IDs of the process. The last line lists the children of the process! Bear in mind that the children of a process might not be alive after your script has finished its job! (See top left for an exciting example output from the execution of the processID.py file.)
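Because a process can disappear at any moment, even between a psutil.pid_exists() check and the next call, it can be worth catching psutil.NoSuchProcess as well. The following is a minimal sketch of that idea and not part of processID.py; the function name describe_process() and the keys of the returned dictionary are made up for illustration.

```python
#!/usr/bin/env python3
import os

import psutil


def describe_process(pid):
    """Return a dict of basic facts about pid, or None if it is gone."""
    try:
        p = psutil.Process(pid)
        parent = p.parent()  # None when the process has no parent
        return {
            "pid": p.pid,
            "name": p.name(),
            "status": p.status(),
            "parent_pid": parent.pid if parent is not None else None,
            "children": [child.pid for child in p.children()],
        }
    except psutil.NoSuchProcess:
        # The process does not exist, or it exited while we were asking.
        return None


if __name__ == "__main__":
    # Inspect this very script, which is guaranteed to be running.
    print(describe_process(os.getpid()))
```

Returning None keeps the caller simple: a single check covers both a PID that never existed and a process that exited halfway through.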
Getting disk info
Moving on to an equally vital area, we will illustrate how to use psutil to get information about your mounted file systems. The psutil.disk_partitions() method returns all mounted partitions, including information about the mount point of each entry, its device name and its mounting options.
The psutil.disk_usage() method shows information regarding the disk usage of the partition that contains the path given to it as an argument. Executing diskInfo.py on a Debian Linux machine with one mounted partition generates the following kind of output:

$ ./diskInfo.py
[sdiskpart(device='/dev/root', mountpoint='/', fstype='ext3', opts='rw,noatime,errors=remount-ro,data=ordered')]
sdiskusage(total=24769884160, used=16321609728, free=8196128768, percent=66.5)
Starting with the code of diskInfo.py, you can create a different version of the file that can tell you whether a given disk partition is mounted or not in case you have problems with a hard disk and want to look into it even more.
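One possible shape for such a variant is sketched below. It is an illustration under assumptions, not the author's diskInfo.py: the function name find_partition() and its optional partitions parameter are hypothetical, whereas psutil.disk_partitions() and psutil.disk_usage() are the real calls described above.

```python
#!/usr/bin/env python3
import sys

import psutil


def find_partition(mountpoint, partitions=None):
    """Return the partition entry mounted at mountpoint, or None."""
    if partitions is None:
        # By default psutil tries to report physical devices only.
        partitions = psutil.disk_partitions()
    for part in partitions:
        if part.mountpoint == mountpoint:
            return part
    return None


if __name__ == "__main__":
    # The mount point to look for; "/" is used when none is given.
    target = sys.argv[1] if len(sys.argv) > 1 else "/"
    part = find_partition(target)
    if part is None:
        print(target, "is not in the list of mounted partitions")
    else:
        print(target, "is mounted from", part.device)
        print(psutil.disk_usage(target))
```

Passing the partition list in as a parameter is a design choice made here so that the lookup can be tried out with made-up entries, without touching a real disk.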
Network interfaces
The example code for this section, called netPS.py, will illustrate how to use psutil to get network-related information. The psutil.net_connections() method returns a list of network connections depending on the parameter passed to it. You can either choose to use no parameters at all or pass one of the following parameters to it: “inet”, “inet4”, “inet6”, “tcp”, “tcp4”, “tcp6”, “udp”, “udp4”, “udp6”, “unix” or “all”. If you’re not sure about the kind of network traffic you want, you should pass the “all” argument to psutil.net_connections().
Executing netPS.py using root privileges generates the following kind of output:

[sconn(fd=-1, family=<AddressFamily.AF_UNIX: 1>, type=1, laddr='private/error', raddr=None, status='NONE', pid=None), ...

If you try to execute netPS.py without root privileges, you might get the following kind of error message depending on your Linux setup and the privileges of the user account used for executing netPS.py:

$ ./netPS.py
...
psutil.AccessDenied: psutil.AccessDenied (pid=84595)

(See the previous page for detailed output from netPS.py.)
The more output that you get from netPS.py, the more network connections your machine has. If your Linux machine is a web server then having lots of connections is perfectly acceptable; however, if you have a standalone client, then too many network connections is not likely to be a good thing and might flag up the fact that you need to check the security of your machine.
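To judge at a glance whether the number of connections looks reasonable, you can group them by status instead of reading the full list. The following is a rough sketch and not the code of netPS.py; the function names count_by_status() and summarize_connections() are made up for illustration, and returning None on psutil.AccessDenied is just one possible way of handling missing privileges.

```python
#!/usr/bin/env python3
from collections import Counter

import psutil


def count_by_status(connections):
    """Count connection entries, grouped by their status field."""
    return Counter(conn.status for conn in connections)


def summarize_connections(kind="inet"):
    """Return a Counter of connection statuses, or None without access."""
    try:
        return count_by_status(psutil.net_connections(kind=kind))
    except psutil.AccessDenied:
        # Some setups need root privileges for a system-wide view.
        return None


if __name__ == "__main__":
    summary = summarize_connections("inet")
    if summary is None:
        print("Access denied: try running the script as root")
    else:
        for status, total in summary.most_common():
            print(total, status)
```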
Logged in users
Let’s go on and now learn how to use psutil to display information about the logged in users of your Linux machine. The code of who.py is the following:

#!/usr/bin/env python3
import psutil

for user in psutil.users():
    print(user.name, "at", user.terminal)

You just have to iterate over the return value of psutil.users(), which does all the dirty work. Executing who.py generates the following kind of informative output:

$ ./who.py
mtsouk at pts/0
As you will likely note, you can also see the terminal each user is connected to.
Examining Apache processes
Each process has a parent process that started it, and having one parent start several children is a very common way of running multiple instances of the same program. A very popular example is the Apache web server: when you start Apache, an apache process is started. This process automatically starts additional apache processes that are used for serving HTTP requests; the number of apache processes that will be created is defined in the Apache configuration file. The key point here is that without knowing the way Apache works you can’t implement what you want to do, just as you can’t make sense of psutil output without knowing what it describes.
In this section, we will show you how to use psutil, and some command line utilities, to find out the process ID of the parent process that started all Apache child processes. The name of the Python script is apacheParent.py, whereas the name of the Apache process is usually apache2. If you aren’t sure about the name of the Apache 2 executable, you can find it out pretty easily:

$ ps ax | grep -i apache | grep -v grep | awk {'print $5'} | uniq
/usr/sbin/apache2
The Python 3 script, called apacheParent.py, doesn’t require the use of the full path for the Apache 2 executable.
The approach that apacheParent.py uses is pretty simple, yet very effective: after finding all processes with a given name, which is passed as a command-line argument to the script, it finds the process ID of the parent process of each one of them and prints it. In order to avoid dealing with lists and processing multiple variables in Python, which is possible but time-consuming, the rest of the processing will be done externally using traditional Unix command line tools. If the child processes don’t share a unique parent process ID, it means that there’s something wrong with your Apache 2 server process or that you are running multiple Apache 2 instances that listen to multiple TCP/IP ports.
The important code of apacheParent.py is the following:

for proc in psutil.process_iter():
    if proc.name() == nameP:
        p = psutil.Process(proc.pid)
        print(p.parent().pid)
As you can see, by combining the capabilities of psutil with other Linux tools and using a bit of imagination, you can develop great things.
An important warning: please don’t assume that the Apache 2 process with the smallest process ID will be the parent process of the other Apache 2 processes. Process ID values wrap around when they reach their maximum value and, although you can change this maximum value, it’s usually pretty low. You can find the maximum process ID value of your Linux machine as follows:

$ cat /proc/sys/kernel/pid_max
32768

In order to execute ./apacheParent.py and get the results you want, you should do the following:

$ ./apacheParent.py apache2 | sort | uniq -c | sort -rn
     10 27111
      1 1
The first line tells us that the process ID of the parent process is 27111. The second line of the output indicates that the first Apache 2 process got started by /sbin/init, which has a process ID of 1. The first column shows the number of times a process ID appears in the output, so adding up its values gives the total number of apache2 processes running on the Linux machine.
It’s worth bearing in mind that apacheParent.py also works with other programs that start many processes because the name of the process you are looking for is given as a command line argument to it, which can be anything that you want! (See apacheParent.py in action, above.)
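The tutorial deliberately leaves the counting to sort and uniq, but if you would rather stay inside Python, the tally can be sketched with collections.Counter. This is an alternative for illustration only, not the code of apacheParent.py: the function names parent_pids() and tally_parents() are hypothetical, and the sketch uses the Process.ppid() method instead of parent().pid.

```python
#!/usr/bin/env python3
import sys
from collections import Counter

import psutil


def parent_pids(name):
    """Yield the parent PID of every running process called name."""
    for proc in psutil.process_iter():
        try:
            if proc.name() == name:
                yield proc.ppid()
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            # The process exited, or we may not inspect it: skip it.
            continue


def tally_parents(name):
    """Return a Counter that maps each parent PID to its number of children."""
    return Counter(parent_pids(name))


if __name__ == "__main__":
    # The process name to look for; apache2 is used when none is given.
    nameP = sys.argv[1] if len(sys.argv) > 1 else "apache2"
    # most_common() lists the largest counts first, much like sort -rn.
    for ppid, total in tally_parents(nameP).most_common():
        print(total, ppid)
```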
Plotting psutil data
The main benefit you get from using psutil is that you can easily combine the capabilities of Python 3 and its modules with the psutil module. In this last section we will illustrate how to get the output of psutil and create a plot using matplotlib (or Octave, the open source alternative; see Tutorials, p76, LXF209), which is a powerful 2D plotting library! The important Python 3 code of CPUgraph.py is the following:

for k in range(N):
    x.append(k)
    temp = psutil.cpu_percent(interval=2)
    y.append(temp)

As you can see, you use the psutil.cpu_percent() method to get the current CPU usage, which you add to a list variable that’s named y. The number of elements in the list is defined by the value you pass to the range() function, which is given as a command line argument to the program. Because interval=2 makes each call block for two seconds, collecting N samples takes about 2N seconds. You must wait until you have collected all the desired data before you can plot it and save the output as a PNG file. Executing CPUgraph.py generates reasonably pretty output (like the example graph that you can see below).
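The plotting step itself is not part of the excerpt, so the following sketch shows one way the collected data could be turned into a PNG file. It is written under assumptions and is not the author's CPUgraph.py: the function names sample_cpu() and plot_cpu(), the axis labels and the output file name are all made up for illustration, and the Agg backend is chosen so that the script also works on a machine without a display.

```python
#!/usr/bin/env python3
import sys

import matplotlib
matplotlib.use("Agg")  # draw straight to a file, no display needed
import matplotlib.pyplot as plt
import psutil


def sample_cpu(n, interval=2):
    """Collect n CPU usage readings, each measured over interval seconds."""
    x, y = [], []
    for k in range(n):
        x.append(k)
        y.append(psutil.cpu_percent(interval=interval))
    return x, y


def plot_cpu(x, y, filename):
    """Plot the readings and save them; the extension selects the format."""
    fig, ax = plt.subplots()
    ax.plot(x, y)
    ax.set_xlabel("Sample")
    ax.set_ylabel("CPU usage (%)")
    ax.set_ylim(0, 100)
    fig.savefig(filename)
    plt.close(fig)


if __name__ == "__main__":
    # Number of samples; three are taken when no valid number is given.
    N = int(sys.argv[1]) if len(sys.argv) > 1 and sys.argv[1].isdigit() else 3
    x, y = sample_cpu(N)
    plot_cpu(x, y, "CPUgraph.png")
```

Keeping the sampling and the plotting in separate functions means you can reuse the same readings for a different kind of output, such as a CSV file.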
Hopefully, you are now persuaded of how useful the psutil module can be for advanced Linux users, and especially for system administrators. Additionally, Python code that uses psutil can run as a cron job without any changes. As a result, adding psutil to your arsenal of tools will be very beneficial for you.