Low-level kernel access

John Schwartzman shows how to write assembly language code that calls Linux kernel services and the C run-time library.

2019-09-24 - John Schwartzman is a long-time engineering consultant to business and government. He also teaches computer science at a local college.

Last issue we used assembly language to access Linux kernel services. Now we’re going to use the C run-time library, glibc, instead of calling the kernel services directly. The glibc functions are in many cases thin wrappers around the Linux kernel services. This is the preferred way to access Linux kernel services.

Kernel system calls are limited to six arguments, but that’s not enough for the C library. We use almost the same six registers that we used for kernel system calls: RDI, RSI, RDX, RCX (instead of R10), R8 and R9, but any number of additional arguments can be passed to C library functions on the stack. We populate the registers listed above with the arguments to the function. We then PUSH the remaining arguments onto the stack and remove them from the stack after the C library function returns. You’ll see this in environment.asm.
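The register-then-stack rule can be seen from the C side with a function that takes more than six integer arguments. This is only a sketch: sum8 and show_sum8 are hypothetical names invented for illustration, and the register assignments in the comment assume the System V AMD64 calling convention used on 64-bit Linux.

```c
#include <stdio.h>

/* Hypothetical eight-argument function, used only to illustrate the rule.
 * For integer arguments the compiler places:
 *   a1 -> RDI, a2 -> RSI, a3 -> RDX, a4 -> RCX, a5 -> R8, a6 -> R9
 *   a7 and a8 -> the stack (a8 is stored first, so a7 ends up nearer the top)
 */
long sum8(long a1, long a2, long a3, long a4,
          long a5, long a6, long a7, long a8)
{
    return a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8;
}

/* Caller side: passes eight arguments, prints the sum and returns it. */
long show_sum8(void)
{
    long total = sum8(1, 2, 3, 4, 5, 6, 7, 8);
    printf("sum8 = %ld\n", total);
    return total;
}
```

Compiling this without optimisation using gcc -S and reading the assembly generated for show_sum8 is one way to check where the seventh and eighth arguments end up.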

When using the kernel system calls we passed the ID of the specific service in the RAX register and entered the kernel through one common entry point, the SYSCALL instruction. When using the C library, we link to and call the specific function we want by name. RAX still carries the function’s return value back to the caller, which for many functions is a success or failure status.
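The same contrast can be sketched in C. The glibc function syscall() takes the service number as its first argument, much as the assembly code places it in RAX, while write() is the named wrapper for the same service. The two function names below are hypothetical and exist only for this comparison.

```c
#define _GNU_SOURCE          /* exposes the syscall() declaration in glibc */
#include <string.h>
#include <sys/syscall.h>     /* SYS_write: the service number */
#include <unistd.h>          /* syscall(), write(), STDOUT_FILENO */

/* Route 1: ask for kernel service number SYS_write explicitly. */
long write_by_number(const char *text)
{
    return syscall(SYS_write, STDOUT_FILENO, text, strlen(text));
}

/* Route 2: call the glibc wrapper by name; the linker resolves write(). */
long write_by_name(const char *text)
{
    return (long)write(STDOUT_FILENO, text, strlen(text));
}
```

Both routes return the number of bytes written, or -1 on failure with errno set.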

ARRRGS!

Our next programs are cmdline.c (Figure 1, above right) and cmdline.asm (Figure 2, page 92). When a main function is invoked it receives the parameters that the user typed on the command line. If you type ./cmdline alpha beta goldfish at the command prompt, Linux will execute the program cmdline. The program receives two parameters: argc, which is the total number of string arguments (four in this case), and argv[], an array of pointers to the strings on the command line.

In this case, cmdline will receive as strings ./cmdline, alpha, beta and goldfish. Cmdline.c and cmdline.asm read and print argc and the argv[] strings. Since this is Linux, you can guess how we receive these parameters. RDI will have the integer argc (the first argument), and RSI will have the address of the vector of pointers, argv.

Cmdline.c should be easy to understand. The prototype for main is: int main(int argc, char* argv[]). After printing argc, we use a for loop to print each parameter index, i, followed by the string parameter argv[i]. That’s it. Execute ./a.out alpha beta goldfish. Now do the same thing in assembly language: execute ./cmdline alpha beta goldfish.
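For readers without Figure 1 to hand, the logic just described can be sketched as follows. This is a reconstruction from the description, not the author’s listing: print_args is a hypothetical helper, and the wording of the argc line is an assumption (the argv line follows the format string quoted later in the article).

```c
#include <stdio.h>

/* Prints argc, then one line per argument, and returns the number of
 * argument lines printed. A real cmdline.c would do this inside main. */
int print_args(int argc, char *argv[])
{
    int i;

    printf("argc = %d\n", argc);                /* wording is an assumption */
    for (i = 0; i < argc; i++) {
        printf("argv[%d] = %s\n", i, argv[i]);  /* index, then the string */
    }
    return i;
}
```

A main with the prototype given above would simply forward its own argc and argv to print_args and return 0.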

At the beginning of cmdline.asm we define some constants. Some programmers are lazy and omit the constant declarations – they simply insert the appropriate numbers in the assembly code. The effect of this is to confuse the human readers of the program. These values look like ‘magic numbers’ when they’re just sprinkled into the code. We urge you to use LF, EOL, TAB and ARG_SIZE instead of 10, 0, 9 and 8. All programs should be self-documenting and a liberal use of constants improves the documentation. High-level languages are somewhat self-documenting, but assembly language needs a lot of documentation!

Our main function calls printf, so main is a caller of printf – but main itself is called by the C startup code, so main is also a callee. Therefore, main must save and restore any callee-saved registers that it uses. Notice that we PUSH R12, R13 and RBX at the beginning of main and then POP them in reverse order at the finish label, before main returns. Before that, however, we have some boilerplate code. We PUSH RBP and then MOV RBP, RSP. This sets up the stack frame. The stack is 16-byte aligned at this point, but we’re about to push 3 * 8 = 24 bytes onto the stack and that would mean that it wouldn’t be 16-byte aligned. We compensate by doing the equivalent of an additional push – we subtract ARG_SIZE (8) from RSP. So, in effect, we’ve subtracted 32 bytes (2 * 16) from RSP, and it’s 16-byte aligned again.
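The alignment arithmetic generalises to any number of saved registers. The sketch below works it through in C; ARG_SIZE comes from the article, while ALIGNMENT and both function names are hypothetical and introduced only for illustration.

```c
#include <stddef.h>

#define ARG_SIZE   8    /* bytes moved by one PUSH, as in the article */
#define ALIGNMENT 16    /* RSP must be a multiple of this at a CALL */

/* Given how many registers are pushed once the frame is set up (when RSP
 * is 16-byte aligned), return the extra bytes to subtract from RSP so that
 * it is 16-byte aligned again. */
size_t alignment_padding(size_t pushes)
{
    size_t used = pushes * ARG_SIZE;
    return (ALIGNMENT - used % ALIGNMENT) % ALIGNMENT;
}

/* Total bytes taken from RSP: the pushes plus the padding. */
size_t total_adjustment(size_t pushes)
{
    return pushes * ARG_SIZE + alignment_padding(pushes);
}
```

With three pushes the padding comes out at 8 and the total at 32, matching the figures above. An even number of pushes needs no padding at all.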

If you look at the end of main, you’ll see the finish label at line 51. There we restore RBX, R13 and R12 by popping them off the stack. That’s fine, but our stack is now ‘off’ by eight bytes because of the subtraction we did in line 21 to keep it 16-byte aligned. No problem; we simply add eight bytes to RSP (line 55). The next piece of boilerplate code is the LEAVE instruction at line 56. That moves RBP (which hasn’t changed, since we copied RSP into it at the beginning of main) into RSP and then pops RBP. LEAVE effectively undoes the PUSH RBP and MOV RBP, RSP that we started main with. Our stack is restored and the RET instruction in line 57 returns the processor’s instruction pointer (RIP) to the C startup code that invoked main.

I like to mov it

In main, we need to save our input parameters, so we MOV RDI (first parameter) into R12 and MOV RSI (second parameter) into R13. R12 and R13 are callee-saved registers, so when we call printf we can be sure that printf won’t change the values of these registers. If it uses them, it guarantees that it will save and restore them, just as we have to do. We have a limited number of registers available, so we have to have rules about which ones can be altered and which must be saved across a function call. In line 36 we zero out the RBX register because we are going to use it as our index variable, i. In our for loop, i is initially zero. Why RBX? Because it’s another callee-saved register and printf must save and restore it for us.

These rules are set forth in the System V Application Binary Interface (ABI) AMD64 Architecture Processor Supplement. You know all about Application Programming Interfaces (API)? Well, the ABI defines the low-level binary interface between two or more pieces of software on a particular architecture. It defines how an application interacts with itself, how an application interacts with the kernel, and how an application interacts with libraries.

We print a blank line (line 30) and then get down to business. First, we load RDI (first argument to printf) with the effective address of formatc. Then we load RSI (second argument to printf) with argc. Then we call print, which zeros RAX and calls printf. (For a variadic function such as printf, AL tells the callee how many vector registers carry arguments; we use none.) Thus we print the number of parameters, argc.

RSI and R13 now point to the vector of strings. argv[0] is first and then eight bytes away is argv[1] and so on. So we enter a do...while loop at the getargvloop label in line 38.

We’re going to call printf, whose prototype is int printf(const char* format, ...). printf uses variadic argument handling, which simply means that it can be called with a variable number of arguments.

The format argument is the ASCIIZ string shown in line 69: "argv[%d] = %s", LF, EOL. The printf function uses the format string (its first parameter) to determine how many parameters it is being passed. There are two placeholders in the format string: %d, which is a placeholder for an integer, and %s, which is a placeholder for an ASCIIZ string. By counting the placeholders in the format string, printf can determine that it should look for two additional parameters. The format string must be followed by the parameters for which it has placeholders. So printf is getting a format string followed by a number, followed by a string.
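To make the placeholder counting concrete, here is a cut-down formatter written with the standard stdarg.h macros. It is a sketch of the general technique only, not how glibc implements printf: mini_format is a hypothetical name, and it understands nothing beyond %d and %s.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cut-down formatter: understands only %d and %s.
 * Writes at most size-1 characters to out and returns the number of
 * variadic arguments it consumed. */
int mini_format(char *out, size_t size, const char *format, ...)
{
    va_list args;
    size_t used = 0;
    int consumed = 0;

    if (size == 0) {
        return 0;
    }
    out[0] = '\0';
    va_start(args, format);
    for (const char *p = format; *p != '\0' && used + 1 < size; p++) {
        if (p[0] == '%' && p[1] == 'd') {
            /* The placeholder tells us the next argument is an int. */
            used += (size_t)snprintf(out + used, size - used, "%d",
                                     va_arg(args, int));
            consumed++;
            p++;                        /* skip the 'd' */
        } else if (p[0] == '%' && p[1] == 's') {
            /* The placeholder tells us the next argument is a string. */
            used += (size_t)snprintf(out + used, size - used, "%s",
                                     va_arg(args, const char *));
            consumed++;
            p++;                        /* skip the 's' */
        } else {
            out[used++] = *p;           /* ordinary character: copy it */
            out[used] = '\0';
        }
        if (used >= size) {
            used = size - 1;            /* snprintf truncated the output */
        }
    }
    va_end(args);
    return consumed;
}
```

The function never receives an argument count. It learns that two arguments follow, and what type each one is, purely by walking the format string.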

How do you think those three arguments will be passed? If you guessed that RDI would contain the address of the ASCIIZ format string and the RSI (or rather the double-word ESI) would contain the integer and that RDX would contain the address of the string argument, you’d be correct.

ESI is the lower half of RSI. RSI is 64 bits wide, which is the size of a C long int in 64-bit C. ESI is 32 bits wide, which is the size of a C int in 64-bit C.

After printing, we increment RBX and compare it to R12, which you’ll remember is holding argc. Comparing is essentially a subtraction, except that we don’t save the result. We subtract R12 from RBX and if the CPU’s sign flag (SF) is set, then the CMP subtraction gave us a negative answer. (Flags are kept in the eflags register.) So if SF is set, we make the conditional jump back to the top of the loop and repeat the process until RBX is equal to R12, at which point SF is cleared in the eflags register and we’re finished.
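The loop control can be modelled in C as a do...while whose condition is the sign of the subtraction. This is a sketch with a hypothetical function name; the variables are named after the registers they stand in for, and the printf call is replaced by a counter so that the number of passes can be checked.

```c
/* Models the loop control described above.
 * rbx plays the index i, r12 plays argc. */
long count_loop_passes(long r12)
{
    long rbx = 0;               /* index starts at zero */
    long passes = 0;

    do {
        passes++;               /* the printf call would go here */
        rbx++;                  /* increment the index */
    } while (rbx - r12 < 0);    /* CMP RBX, R12: loop while the result is negative */

    return passes;
}
```

Because the test comes at the bottom, the body always runs at least once. That is harmless here, since argc is normally at least 1 (the program name itself).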

What a debugger

Run this program in the debugger using ddd -x --args cmdline alpha beta goldfish and put it through its paces. Set a breakpoint on line 46 and hit Run. Observe the status of the sign flag after we perform CMP RBX, R12. The sign flag is visible in the eflags register of the Registers Window. From the ddd menu select Status > Registers to see the Registers Window (see Figure 3, above). The ddd debugger needs the prefix --args before cmdline in order to pass multiple arguments to the program.

So far, we’ve looked at assembly language programs that call the kernel directly, and assembly language programs that call the C run-time library (which calls the kernel). We’re now going to look at assembly language functions that are callable from C programs. Our next programs are environment.c and environment.asm. In these programs we interrogate and print several environment variables.

In environment.asm we start to make assembly language look like a high-level language by using the macro-assembler facility. In line 21 we declare the macro getsaveenv that takes one argument, %1 . The macro looks like this:

    %macro getsaveenv 1
        lea rdi, [env%1]
        call getenv
        lea rdi, [buf%1]
        mov rsi, rax
        mov rdx, BUFF_SIZE - 1
        lea rcx, [nullline]
        cmp rax, ZERO
        cmovz rsi, rcx
        call strncpy
    %endmacro

When we write getsaveenv HOME in line 45, what we’re actually producing is:

    lea rdi, [envHOME]      ; rdi contains the address of "HOME"
    call getenv             ; invoke getenv
    lea rdi, [bufHOME]      ; rdi contains destination for HOME
    mov rsi, rax            ; rax is the return value from getenv
    mov rdx, BUFF_SIZE - 1  ; only write this much
    lea rcx, [nullline]     ; in case getenv returned 0 in rax
    cmp rax, ZERO           ; test whether rax contains 0
    cmovz rsi, rcx          ; if rax == 0, move rcx into rsi
    call strncpy            ; invoke strncpy

You haven’t seen the conditional move instruction cmovz yet. It means move if zero: the move happens only if ZF is set in the eflags register. We copy the address of the (null) string into RCX and then test whether RAX contains 0. If it does, we mov RCX into RSI (the second argument to strncpy). If it doesn’t, we leave RSI with the value returned by getenv. Using cmovz instead of JZ saves quite a few instructions and labels.
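In C, the work done by one expansion of the macro could be written roughly as below. This is a sketch under stated assumptions: save_env is a hypothetical name, the text of nullline is assumed to be "(null)", and BUFF_SIZE is taken to be 128 because the listing file encodes BUFF_SIZE - 1 as 0x7F. The explicit terminator on the last line is an addition for safety and is not part of the macro.

```c
#include <stdlib.h>
#include <string.h>

#define BUFF_SIZE 128   /* assumed: the listing shows BUFF_SIZE - 1 as 0x7F */

static const char nullline[] = "(null)";   /* exact text is an assumption */

/* C rendering of the getsaveenv macro: look up name, fall back to
 * nullline when it is unset, and copy at most BUFF_SIZE - 1 bytes. */
char *save_env(char buf[BUFF_SIZE], const char *name)
{
    const char *value = getenv(name);      /* call getenv */

    if (value == NULL) {                   /* cmp rax, ZERO / cmovz rsi, rcx */
        value = nullline;
    }
    strncpy(buf, value, BUFF_SIZE - 1);    /* call strncpy */
    buf[BUFF_SIZE - 1] = '\0';             /* explicit terminator, added for safety */
    return buf;
}
```

The if statement is the part that cmovz replaces: one conditional instruction selects between the two source pointers without a branch.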

This macro gets and saves a single environment variable. We repeat the operation 13 times and print all the results with a single call to printf . That’s not really very sensible. It would be much more straightforward to get, save and print one environment variable at a time – but then you wouldn’t see how to pass arguments on the stack. As an exercise, modify environment.asm to print one environment variable at a time.

Note how compact we’ve made this program – and how much typing we’ve saved and how many potential errors we’ve averted – by using a macro. In lines 61 to 78 we set up our arguments to printf. The first argument is formatstring, which contains 14 string placeholders. We pass the address of formatstring in RDI. The next five arguments are placed in registers RSI, RDX, RCX, R8 and R9, just as we’d expect. Note that we pushed RDI, the single argument to printenv (the address of the ASCIIZ date string), in line 39, but we popped it into RSI in line 62. This turns out to be a convenient way to hang onto RDI until we need it, and when we do need it, we need it in RSI to be the second argument to printf.

We have now used up our six argument-passing registers and must pass the remaining nine arguments on the stack. The question is how do we pass them. If you guessed “in reverse order”, you’d be correct. Remember, the stack grows downward. It ends up looking like this:

    bufHISFILE
    bufps1
    bufLANG
    bufMAIL
    bufEDITOR
    bufSHELL
    bufPATH
    bufTERM
    bufpwd       <== top of stack

printf will grab the first six arguments from registers and then read the rest from the stack. Starting at the top of the stack, it reads bufpwd followed by bufTERM and so on. Each push of a value onto the stack automatically subtracts eight bytes from the stack pointer register RSP. We call printf and then fix the stack with ADD RSP, NUM_PUSH * PUSH_SIZE. In other words, after pushing nine 8-byte addresses onto the stack, we ‘remove’ them from the stack by adding 8 * 9 = 72 bytes to RSP. It’s the caller’s responsibility to clean up the stack. You can see that these nine addresses are still on the stack, but as far as RSP is concerned, they don’t exist.
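Why reverse order works can be checked with a small simulation of a downward-growing stack. Everything here is hypothetical scaffolding for illustration: the arguments are represented by their position numbers (7 to 15) rather than by real addresses, and PUSH_SIZE follows the article’s constant.

```c
#include <stddef.h>

#define REG_ARGS   6   /* RDI, RSI, RDX, RCX, R8, R9 */
#define PUSH_SIZE  8   /* bytes per PUSH */
#define MAX_SLOTS 32   /* size of the simulated stack, arbitrary */

/* Simulated stack that grows downward: sp starts past the end and each
 * push decrements it, just as PUSH subtracts from RSP. */
struct sim_stack {
    int slots[MAX_SLOTS];
    size_t sp;
};

void sim_init(struct sim_stack *s) { s->sp = MAX_SLOTS; }
void sim_push(struct sim_stack *s, int value) { s->slots[--s->sp] = value; }

/* Push arguments REG_ARGS+1 .. total (numbered from 1) in reverse order
 * and return how many bytes the caller must add back to RSP afterwards. */
size_t push_stack_args(struct sim_stack *s, int total)
{
    size_t pushed = 0;

    for (int arg = total; arg > REG_ARGS; arg--) {
        sim_push(s, arg);
        pushed++;
    }
    return pushed * PUSH_SIZE;
}
```

With 15 arguments in total, nine are pushed and the cleanup is 72 bytes. Because the last argument is pushed first, argument 7 ends up at the top of the stack and the rest follow in ascending order at higher addresses, which is the order in which the callee reads them.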

A dedede, a dododo

You can visualize what’s going on using the ddd debugger, but it doesn’t do a great job with macros. You have to open up the Machine Code Window to see the expanded macros, and use the Stepi and Nexti commands to step through your code. Another option is to have yasm produce a listing file. The makefile for environment produces environment.lst. This file shows the machine code on the left and the assembly on the right. The expansion of the getsaveenv HOME is shown here:

    40 00000009 488D3C25[00000000]      lea rdi, [envHOME]
    41                                  %line 45+0 environment.asm
    42 00000011 E8(F6FFFFFF)            call getenv
    43 00000016 488D3C25[00000000]      lea rdi, [bufHOME]
    44 0000001E 4889C6                  mov rsi, rax
    45 00000021 48C7C27F000000          mov rdx, BUFF_SIZE - 1
    46 00000028 488D0C25[00000000]      lea rcx, [nullline]
    47 00000030 4883F800                cmp rax, ZERO
    48 00000034 480F44F1                cmovz rsi, rcx
    49 00000038 E8(F6FFFFFF)            call strncpy

Running the program should make sense (see Figure 6, top right). We typed the command unset EDITOR before running environment so that you could see the effect of an unset environment variable. You might find several unset variables in your output. To set them, you have to export them inside one of your startup files. If you log in through a shell, add them to ~/.profile and log out and in again. If you log in graphically, add them to ~/.bashrc and close all terminals. Run environment again and you should see your new entries added.

For our grand finale, we’re going to create two functions in assembly language that can be called by a C/C++ program. The function signatures are long printmax(long a, long b) and long printmin(long a, long b). We’ll use the macro-assembler to generate our printmax and printmin functions and also to assign the local variables a and b.

Local variables (also called automatic variables) are created on the stack. We could do everything we need with registers, but we wanted you to see how easy and painless the macro-assembler makes using local variables. minmax.asm’s use of the macro-assembler should give you insight into how high-level languages are built. Makefile builds minmax.c in two ways. As a standalone C program it invokes the compiler and linker with gcc -D c_version minmax.c. This makes the executable a.out. Makefile also invokes the compiler with gcc minmax.c minmax.obj -o minmax in order to compile minmax.c and to link it with minmax.obj, the assembler output for minmax.asm. This makes the executable minmax. Examine makefile and minmax.c to see how defining or not defining c_version changes minmax.c. This is called conditional compilation. Please refer to the article’s source code.
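As a rough idea of how such a switch can be arranged, the sketch below compiles C versions of the two functions when c_version is defined and falls back to bare declarations otherwise. It is an illustration of conditional compilation in general, not the contents of the author’s minmax.c: the function bodies and the format strings are assumptions, and c_version is defined inside the file only so that the sketch builds on its own.

```c
#include <stdio.h>

/* Normally supplied on the command line with gcc -D c_version;
 * defined here only so that this sketch builds on its own. */
#define c_version

#ifdef c_version
/* C build: the functions are compiled from this file. */
long printmax(long a, long b)
{
    long answer = (a < b) ? b : a;
    printf("max(%ld, %ld) = %ld\n", a, b, answer);   /* format text assumed */
    return answer;
}

long printmin(long a, long b)
{
    long answer = (a < b) ? a : b;
    printf("min(%ld, %ld) = %ld\n", a, b, answer);   /* format text assumed */
    return answer;
}
#else
/* Assembly build: declarations only; the linker finds the bodies
 * in minmax.obj. */
long printmax(long a, long b);
long printmin(long a, long b);
#endif
```

Removing the #define line and linking against minmax.obj would select the second branch, so the same calling code can exercise either implementation.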

The function printmax contains the macros prologue, max and epilogue, and the function printmin reuses prologue and epilogue, but substitutes min for max. Expanding our macros, printmax looks like this:

    sub rsp, 16               ; make space on stack for 2 64-bit longs - prologue
    mov a, rdi                ; move 1st param into local var a
    mov b, rsi                ; move 2nd param into local var b
    mov rsi, a                ; a is 2nd arg to printf
    mov rdx, b                ; b is 3rd arg to printf
    mov rcx, rsi              ; mov a into rcx
    cmp rcx, b                ; is a < b?
    cmovb rcx, b              ; if yes, move b into rcx - max
    lea rdi, [formatstrmax]   ; formatstrmax is 1st arg to printf
    xor rax, rax              ; zero rax for printf - epilogue
    push rcx                  ; rcx contains answer
    call printf               ; invoke printf
    pop rax                   ; rax = answer
    add rsp, 16               ; remove a and b from stack
    ret                       ; return with answer in rax

printmin looks almost the same; we simply substituted two lines. Now run minmax in the debugger:

$ make debug
$ yasm -f elf64 -g dwarf2 -o minmax.obj -l minmax.lst minmax.asm
$ gcc -m64 -g -no-pie minmax.c minmax.obj -o minmax
$ gcc -g -D c_version minmax.c
$ ddd -x --args minmax 16 32

Set a breakpoint at line 39 of minmax.c and click Run. Step into printmax by clicking the Step button. Open the Machine Code Window by selecting View > Machine Code Window on the menu. Then select Data > Memory from the menu. In the DDD: Examine Memory dialogue, select ‘Examine 2 decimal giants’ (64-bit longs) from $rsp and click the Display button. Next, click the Stepi button on the Command Tool. Click Stepi two more times to progress into printmax and your screen should look something like Figure 7 (below).

Notice that the first two parameters to printmax(a, b) are shown in rdi and rsi and also in the memory display at the top left of Figure 7, where you have moved them into local variables long a and long b. Notice where you wrote mov a, rdi and mov b, rsi in your code. The macro-assembler translated that to mov QWORD PTR [rsp], rdi and mov QWORD PTR [rsp+0x8], rsi. Is that not cool?

Continue to click Stepi to move through the assembler code. Note the bottom green arrow points to the processor’s instruction pointer (RIP). You might want to click Nexti for the call to printf or you’ll spend a lot of time wandering through glibc. Check the Execution Window after returning from printf to make sure minmax is behaving properly.

We hope you now have a new appreciation of what goes on in a high-level language. Thanks for reading, and have fun!

Figure 1: cmdline.c. A C program that prints all of the arguments it receives on the command line.

Figure 2: The beginning of cmdline.asm, the assembly language equivalent to cmdline.c.

Figure 3: cmdline.asm during a debug session. The program is shown halted at a breakpoint in the DDD debugger.

Figure 4: The environment.c program gets the current time as a string and then invokes the printenv function in environment.asm.

Figure 5: The beginning of environment.asm. This program exports its printenv function, which is called by environment.c.

Figure 6: The output of ./environment, which shows the current time and the contents of several common environment variables.

Figure 7: minmax.asm is shown during a debug session. The program is shown halted at a breakpoint in the DDD debugger.
