Sunday, February 14, 2016

Fun fact: transient failure to read process name on Linux

Tools like ps obtain process names by reading /proc/<pid>/cmdline. The kernel generates the content of that file by accessing the target process's address space, and this information is temporarily unavailable during execve.

In particular, a new structure describing the address space is allocated. It is assigned to the process late in execve, but before it is fully populated. The code generating cmdline detects this condition and returns 0, meaning no data was generated.

Consider execve-loop, doing execs:
#include <unistd.h>

int
main(int argc, char **argv)
{

    /* Re-exec ourselves forever; execv only returns on failure. */
    execv(argv[0], argv);
    return (1);
}

And execve-read, doing reads:
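#include <sys/types.h>
#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
    char buf[100];
    char *path;
    ssize_t n;
    int fd;

    if (argc != 2)
        return (1);

    path = argv[1];

    for (;;) {
        fd = open(path, O_RDONLY);
        if (fd == -1)
            err(1, "open");
        /* Leave room for the terminating NUL; read() does not add one. */
        n = read(fd, buf, sizeof(buf) - 1);
        if (n == -1)
            err(1, "read");
        if (n == 0) {
            /* 0 bytes: the target was caught in the middle of execve. */
            printf("failure: empty read\n");
        } else {
            buf[n] = '\0';
            printf("success: [%s]\n", buf);
        }
        close(fd);
    }
}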

Let's run them:

shell1$ ./execve-loop
shell2$ ./execve-read /proc/$(pgrep execve-loop)/cmdline
success: [./execve-loop]
success: [./execve-loop]
success: [./execve-loop]


Could the kernel be modified to, for example, provide the old name or, in the worst case, wait until the new name becomes available? Yes, but this does not seem to be worth it.
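Since the kernel is unlikely to change, a reader that cares can paper over the window in userspace. Below is a minimal sketch of that idea, not something any existing tool is known to do: read_cmdline_retry is a hypothetical helper, and the attempt count and the 1 ms delay are arbitrary assumptions. The retry has to be bounded, because some processes (kernel threads, for instance) legitimately have an empty cmdline.

```c
#define _POSIX_C_SOURCE 200809L

#include <sys/types.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a cmdline file into buf, retrying a bounded
 * number of times when the read returns 0 bytes.
 *
 * Returns the number of bytes read, 0 if the file was still empty after
 * all attempts, or -1 on error.
 */
static ssize_t
read_cmdline_retry(const char *path, char *buf, size_t len, int attempts)
{
    struct timespec delay = { 0, 1000000 };    /* 1 ms, arbitrary */
    ssize_t n = 0;
    int fd, i;

    if (len == 0)
        return (-1);

    for (i = 0; i < attempts; i++) {
        /* Reopen every time so each attempt generates fresh content. */
        fd = open(path, O_RDONLY);
        if (fd == -1)
            return (-1);
        /* Leave room for the terminating NUL. */
        n = read(fd, buf, len - 1);
        close(fd);
        if (n != 0)
            break;
        /* Empty read: possibly mid-execve, wait a bit and retry. */
        nanosleep(&delay, NULL);
    }

    if (n > 0)
        buf[n] = '\0';
    return (n);
}
```

A caller would use it along the lines of read_cmdline_retry("/proc/1234/cmdline", buf, sizeof(buf), 10) and treat a return value of 0 as "no name available" rather than as an error. Arguments in cmdline are NUL-separated, so printing buf with %s shows only the first one.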

Tuesday, February 9, 2016

kernel game #3

Assume we have an extremely buggy driver. Multiple threads can call into meh_ioctl shown below at the same time with the same device and there is no locking provided. The routine is supposed to either store a pointer to a referenced struct file object in m->fp or just clear the entry (and of course get rid of the reference).

What can go wrong here? Consider both single-threaded and multithreaded execution.

int meh_ioctl(dev_t dev, ioctl_t ioct, int data)
{
        meh_t *m = to_meh(dev);
        struct file *fp;

        switch (ioct) {
        case MEH_ATTACH:
                /* data is the fd we are going to borrow the file from */

                /* check if we already have a reference to a file */
                if (m->fp != NULL)
                        return EBUSY;
                /* fget returns the file with a reference or NULL on error */
                fp = fget(data);
                if (fp == NULL)
                        return EBADF;
                m->fp = fp;
                break;
        case MEH_DETACH:
                if (m->fp == NULL)
                        return EINVAL;
                m->fp = NULL;
                break;
        }

        return 0;
}

Monday, February 8, 2016

kernel game #2

Consider a kernel where processes are represented with struct proc objects. The kernel implements unix-like interfaces and works with multiple CPUs.

The following syscall is provided:

int sys_fork(void)
{
        struct proc *p;
        int error;

        error = kern_fork(&p);
        if (error == 0)
                curthread->retval = p->pid;
        return error;
}

That is, if error is 0 we know forking succeeded, in which case the function stores the pid found in the object. Otherwise a non-zero error value is returned and the retval field is not inspected.

Why would this code be incorrect?