our_ioctl
function and add a couple of declarations to our source file. By the end of this article we will be able to intercept any system call in our system should there be a need for that.
System Call Table
The system call table is simply an area of kernel memory that holds the addresses of the system call handlers. A system call number is just an index into that table. This means that when we call sys_write
(to be more precise - when libc calls sys_write
) on a 32-bit system and passes number 4 in EAX
register before int 0x80
, it simply tells the kernel to go to the system call table, take the entry at index 4 and call the function that entry points to. On a 64-bit system the number is 1, passed in RAX
(with syscall
used instead of int 0x80
). System call numbers are defined in arch/x86/include/asm/unistd_32.h
and arch/x86/include/asm/unistd_64.h
for 32 and 64-bit platforms respectively. In this article, we are going to deal with sys_open
system call which is number 5 for 32-bit systems and number 2 for 64-bit systems.
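To make the "number is an index" idea concrete, here is a minimal user-space sketch. It is not kernel code: demo_table, demo_dispatch and the three handlers are hypothetical names invented for illustration, and the real table is consulted by the kernel's system call entry code rather than by a C function like this one.

```c
#include <stddef.h>

/* Hypothetical handler type: every demo "system call" takes one long. */
typedef long (*demo_handler_t)(long arg);

static long demo_read(long arg)  { return arg + 100; }
static long demo_write(long arg) { return arg + 200; }
static long demo_open(long arg)  { return arg + 300; }

/* The "table": an array of handler addresses, indexed by call number. */
static demo_handler_t demo_table[] = { demo_read, demo_write, demo_open };

#define DEMO_NR_CALLS (sizeof(demo_table) / sizeof(demo_table[0]))

/* Look up the handler by number and call it; -1 for an unknown number. */
static long demo_dispatch(size_t nr, long arg)
{
    if (nr >= DEMO_NR_CALLS)
        return -1;
    return demo_table[nr](arg);
}
```

Calling demo_dispatch(2, 5) picks the third entry and runs demo_open. Replacing one entry of such an array is, in essence, what the rest of this article does to the real table.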
Because modern kernels no longer export the sys_call_table
symbol, we have to find its location in memory ourselves. There are some "hackish" ways of locating sys_call_table
programmatically, but they are fragile: they may work on one kernel and fail on another, especially the way they are usually written. Therefore, we are going to use the simplest and safest way and read its location from the /boot/System.map
file. For simplicity, we will just use grep and hardcode the address. On my computer, the command grep "sys_call_table" /boot/System.map
(you should check the file name on your system, as on mine it is /boot/System.map-2.6.38-11-generic
) prints "ffffffff816002e0 R sys_call_table
". The address differs between kernel builds, so use the one from your own file. Add a global variable unsigned long *sys_call_table = (unsigned long*)0xYour_Address_Of_Sys_call_table
.
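If you would rather not copy the address by hand, a line of that file is easy to parse. The following is a small user-space sketch, assuming the three-column "address type name" layout shown above; demo_parse_symbol is a hypothetical helper, not a kernel or libc function.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one System.map-style line ("<hex address> <type> <name>").
 * Returns 1 and stores the address if the name equals `wanted`, else 0.
 * demo_parse_symbol is a hypothetical helper made up for this sketch.
 */
static int demo_parse_symbol(const char *line, const char *wanted,
                             unsigned long long *addr)
{
    unsigned long long value;
    char type;
    char name[128];

    /* Three fields must be present: address, one-letter type, name. */
    if (sscanf(line, "%llx %c %127s", &value, &type, name) != 3)
        return 0;
    /* Exact comparison, so similarly named symbols are rejected. */
    if (strcmp(name, wanted) != 0)
        return 0;
    *addr = value;
    return 1;
}
```

The exact name comparison is a deliberate choice: a plain grep for "sys_call_table" may also match other symbols whose names merely contain that string, and picking the wrong line would give us the wrong address.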
Preparations
We will start, as usual, by adding new includes to our code. This time, those include files are:
#include <linux/highmem.h>
#include <asm/unistd.h>
The first one is needed because the system call table lives in a read-only memory area in modern kernels, so we will have to modify the protection attributes of the memory page containing the address of the system call that we want to intercept. The second one provides the system call numbers: instead of hardcoding them, we will use the values defined in the unistd.h
header.
Now we define two values, which will be used as the cmd
argument to the our_ioctl
function. One will tell us to patch the table, the other will tell us to fix it by restoring the original value.
/* IOCTL commands */
#define IOCTL_PATCH_TABLE 0x00000001
#define IOCTL_FIX_TABLE   0x00000004
Add one more global variable int is_set=0
which will be used as a flag telling whether the real (0) or custom (1) system call is in use.
It is important to save the address of the original sys_open
because we are not going to implement our own from scratch. Instead, our function will log information about the call arguments and then perform the actual (original) call. Therefore, we define a function pointer (for the original call) and a function (for the custom call):
/* Pointer to the original sys_open */
asmlinkage int (*real_open)(const char* __user, int, int);

/* Our replacement */
asmlinkage int custom_open(const char* __user file_name, int flags, int mode)
{
    printk("interceptor: open(\"%s\", %X, %X)\n", file_name, flags, mode);
    return real_open(file_name, flags, mode);
}
You have probably noticed the "asmlinkage
" attribute. It is actually a macro that expands to a compiler attribute. We will not go that deep this time; I will just say that it tells the compiler how arguments are passed to the function, given that it is called from assembly code. The "__user
" macro marks the pointer as referring to user-space memory, which a function should normally copy into kernel space before using it. Our handler only logs the file name and hands the pointer on to the original call, so we skip that step here and may ignore the macro for now.
Another crucial pair of functions is the one that allows us to modify the memory page protection attributes directly. One may say that this is risky, but, in my opinion, it is less risky than actually patching the system call table: first, it is architecture dependent and architectures do not change drastically; second, we use kernel functions for that.
/* Make the page writable */
int make_rw(unsigned long address)
{
    unsigned int level;
    pte_t *pte = lookup_address(address, &level);

    if (!pte)
        return -1; /* address is not mapped */
    if (!(pte->pte & _PAGE_RW)) /* set the bit only if it is clear */
        pte->pte |= _PAGE_RW;
    return 0;
}

/* Make the page write protected */
int make_ro(unsigned long address)
{
    unsigned int level;
    pte_t *pte = lookup_address(address, &level);

    if (!pte)
        return -1; /* address is not mapped */
    pte->pte = pte->pte & ~_PAGE_RW;
    return 0;
}
pte_t
is defined as typedef struct { unsigned long pte; } pte_t
and represents the page table entry
. Although it is simply an unsigned long
, it is declared as a struct in order to avoid type misuse.
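The bit manipulation itself can be tried out in user space. The sketch below uses hypothetical stand-ins: demo_pte_t mimics the shape of pte_t, and DEMO_PAGE_RW is an assumed value for the writable bit (bit 1, as on x86), not the kernel's own definition.

```c
/* Hypothetical stand-ins for the kernel's pte_t and _PAGE_RW. */
typedef struct { unsigned long pte; } demo_pte_t;

#define DEMO_PAGE_RW 0x2UL /* assumption: writable flag is bit 1, as on x86 */

/* Nonzero if the writable bit is set. */
static int demo_is_writable(const demo_pte_t *p)
{
    return (p->pte & DEMO_PAGE_RW) != 0;
}

/* Set the writable bit, leaving all other bits untouched. */
static void demo_make_rw(demo_pte_t *p)
{
    p->pte |= DEMO_PAGE_RW;
}

/* Clear the writable bit, leaving all other bits untouched. */
static void demo_make_ro(demo_pte_t *p)
{
    p->pte &= ~DEMO_PAGE_RW;
}
```

Because demo_pte_t is a struct, handing a bare unsigned long to any of these functions is a compile-time error, which is exactly the kind of type misuse the wrapper is meant to catch.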
pte_t *lookup_address(unsigned long address, unsigned int *level)
is provided by the kernel and does all the dirty work for us: it returns a pointer to the page table entry
that describes the page
containing the address
. This function accepts the following arguments:
address
- an address in virtual memory;
level
- a pointer to an unsigned integer that receives the level of the mapping.
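Since protection is changed per page, it helps to know which page a given address falls into. Here is a small sketch, assuming 4 KiB pages; DEMO_PAGE_SIZE, demo_page_base and demo_same_page are hypothetical names, and on a real system the level value reports the actual mapping size, which may be larger.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/* Start address of the page that contains addr. */
static uint64_t demo_page_base(uint64_t addr)
{
    return addr & ~(DEMO_PAGE_SIZE - 1);
}

/* Nonzero if both addresses fall into the same page. */
static int demo_same_page(uint64_t a, uint64_t b)
{
    return demo_page_base(a) == demo_page_base(b);
}
```

With the address from my System.map, the table starts at offset 0x2e0 into its page, so an entry two 8-byte slots further on (the 64-bit number of sys_open) still sits in the same page. For an entry that lands in a different page than the start of the table, you would want to pass the address of the entry itself.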
Let's Get to Business
We are almost there. The only thing left is the actual implementation of the our_ioctl
function. Add the following lines:
switch(cmd)
{
    case IOCTL_PATCH_TABLE:
        make_rw((unsigned long)sys_call_table);
        real_open = (void*)*(sys_call_table + __NR_open);
        *(sys_call_table + __NR_open) = (unsigned long)custom_open;
        make_ro((unsigned long)sys_call_table);
        is_set=1;
        break;
    case IOCTL_FIX_TABLE:
        make_rw((unsigned long)sys_call_table);
        *(sys_call_table + __NR_open) = (unsigned long)real_open;
        make_ro((unsigned long)sys_call_table);
        is_set=0;
        break;
    default:
        printk("Ooops....\n");
        break;
}
And these lines to the cleanup_module
function:
if(is_set)
{
    make_rw((unsigned long)sys_call_table);
    *(sys_call_table + __NR_open) = (unsigned long)real_open;
    make_ro((unsigned long)sys_call_table);
}
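The save, replace and restore sequence can be rehearsed in user space before touching the kernel. This is a sketch only: demo_slot stands in for a single table entry, demo_saved plays the role of real_open, and every name here is hypothetical. Unlike the ioctl handler above, it also checks the flag before patching or restoring, which is an addition of mine rather than part of the module.

```c
typedef long (*demo_fn_t)(long);

static long demo_original(long x) { return x * 2; }

static demo_fn_t demo_slot = demo_original; /* stands in for one table entry */
static demo_fn_t demo_saved;                /* plays the role of real_open */
static int demo_is_set;                     /* 1 while the hook is installed */
static int demo_calls;                      /* how many calls the hook saw */

/* The hook records the call, then forwards to the saved original. */
static long demo_hook(long x)
{
    demo_calls++;
    return demo_saved(x);
}

/* Install the hook once; a second call is a no-op. */
static void demo_patch(void)
{
    if (demo_is_set)
        return;
    demo_saved = demo_slot;
    demo_slot = demo_hook;
    demo_is_set = 1;
}

/* Restore the original only if the hook is installed. */
static void demo_restore(void)
{
    if (!demo_is_set)
        return;
    demo_slot = demo_saved;
    demo_is_set = 0;
}
```

The guard in demo_patch matters more than it looks. Without it, patching twice would store the hook's own address as the "original", and the hook would then call itself forever. Likewise, restoring before patching would write an unset pointer into the slot.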
Our interceptor module is ready. Well, almost ready as we need to compile it. Do that as usual - make
.
Test
Finally, we have our module set and ready to use, but we still have to create a "client" application, the code that will "talk" to our module and tell it what to do. Fortunately, this is much simpler than the rest of the work that we have done here. Create a new source file and enter the following lines:
#include <stdio.h>
#include <unistd.h>      /* sleep(), close() */
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* Define ioctl commands */
#define IOCTL_PATCH_TABLE 0x00000001
#define IOCTL_FIX_TABLE   0x00000004

int main(void)
{
    int device = open("/dev/interceptor", O_RDWR);

    /* Bail out if the device node is missing or not accessible */
    if (device < 0) {
        perror("open /dev/interceptor");
        return 1;
    }
    ioctl(device, IOCTL_PATCH_TABLE);
    sleep(5);
    ioctl(device, IOCTL_FIX_TABLE);
    close(device);
    return 0;
}
Save it as manager.c and compile it with gcc -o manager manager.c
.
Load the module, run ./manager
and then unload the module once manager exits. Now issue the dmesg | tail
command. If you see lines containing "interceptor: open(blah blah blah)
", then you know that those lines were produced by our handler.
Now we are able to intercept system calls in modern kernels despite the fact that sys_call_table
is no longer exported. Although we deal with low-level structures that are normally used only by the kernel, this is still a relatively safe method as long as your module is compiled against the running kernel and the hardcoded address comes from that kernel's System.map.
Hope this post was helpful. See you at the next one!