Tuesday, April 21, 2015

process titles in Linux

FreeBSD provides a dedicated sysctl which updates an in-kernel buffer.

On Linux, tools like ps read /proc/<pid>/cmdline to obtain process titles (names plus their arguments). This content is read from pages mapped into the target process's memory. People started exploiting this by overwriting the stored arguments in order to give their processes informative titles.
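
To make the format concrete, here is a minimal Python sketch of what a ps-like tool does with that file. The helper names parse_cmdline and read_cmdline are made up for illustration; the only thing assumed about the kernel interface is that the file holds the arguments separated by NUL bytes.

```python
from typing import List


def parse_cmdline(raw: bytes) -> List[str]:
    """Split the raw content of /proc/<pid>/cmdline into arguments."""
    if not raw:
        # Kernel threads and zombies expose an empty cmdline.
        return []
    # Each argument is NUL-terminated; drop the final terminator so that
    # split() does not produce a trailing empty string.
    if raw.endswith(b"\0"):
        raw = raw[:-1]
    return [arg.decode("utf-8", errors="replace") for arg in raw.split(b"\0")]


def read_cmdline(pid: int) -> List[str]:
    """Read and parse the command line of a process (Linux only)."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        return parse_cmdline(f.read())
```

A process which has overwritten its arguments with a title often leaves no NUL separators at all, in which case the whole title comes back as a single element.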

Updates have good performance since processes just write to their own memory.

As usual this leads to some user-visible caveats, which fortunately are fixable.

Title consistency

A cosmetic issue here is that there are no consistency guarantees: what happens if the kernel reads the content as it's being written? The window is extremely small, and reads and updates are rare enough that this is unlikely to ever be a problem in practice.

As a side note, the kernel recognises the hack. People can even move the environment (which is normally stored after the argument vector) to make more space for the title and the kernel supports that.

One example approach would have the process tell the kernel where to look for this data and provide a marker indicating whether an update is in progress, so that the kernel can re-read a few times if needed.
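
The marker idea resembles a sequence counter. Below is a single-threaded Python simulation of it, purely as a sketch: TitleBuffer, begin_update, end_update and read_title are hypothetical names, nothing like this exists as a kernel interface, and a real implementation would need memory barriers which Python does not model.

```python
class TitleBuffer:
    """Hypothetical title area guarded by a sequence counter."""

    def __init__(self, title=b""):
        self.seq = 0        # even: content is stable, odd: update in progress
        self.title = title

    def begin_update(self):
        self.seq += 1       # counter becomes odd, readers will back off

    def end_update(self, title):
        self.title = title
        self.seq += 1       # counter becomes even again


def read_title(buf, max_retries=3):
    """Return a consistent snapshot of the title, or None after max_retries."""
    for _ in range(max_retries):
        start = buf.seq
        if start % 2 == 1:
            continue        # writer is in the middle of an update, try again
        snapshot = buf.title
        if buf.seq == start:
            return snapshot # counter did not move, so the snapshot is consistent
    return None             # gave up; the caller decides what to show instead
```

The reader never blocks the writer, which keeps updates as cheap as they are today: two counter increments around the write.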

Hanging processes

Accessing the memory area storing cmdline's content requires locking the target process's address space for reading. Unfortunately, it's possible that something will lock it for writing and block for an unspecified amount of time, preventing any read accesses. Then if you run a tool which reads the file (e.g. ps(1)), it blocks in an uninterruptible manner (i.e. it cannot be killed) waiting for the lock.

So if you happen to have periodically executed scripts which run ps, you can accumulate a lot of unkillable processes. This is especially confusing when you try to debug such a problem, since even a ps run by hand will also block.

This can happen, for example, if an NFS share dies while a process tries to mmap(2) a file backed by it and actual requests need to be issued to the server.

I would say the best course of action here would be to provide a bounded and killable sleep for cmdline reads. If no data could be read, the process name taken from task_struct could be used instead.
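
Until something like that exists in the kernel, a monitoring script can approximate the behaviour from userspace. The Python sketch below is only a workaround under stated assumptions: the helper names and the 0.5 second default are made up, and the fallback relies on /proc/<pid>/comm, which exposes the name stored in task_struct. Note that the helper thread which got stuck stays stuck until the lock is released; the sketch merely keeps the caller responsive.

```python
import threading


def read_with_timeout(read_fn, timeout):
    """Run read_fn in a daemon thread; return its result, or None on timeout or error."""
    result = []

    def worker():
        try:
            result.append(read_fn())
        except OSError:
            pass  # process is gone or file is unreadable; treat as "no data"

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)  # bounded wait; the worker may still be blocked afterwards
    return result[0] if result else None


def read_file(path):
    with open(path, "rb") as f:
        return f.read()


def title_from(raw_cmdline, raw_comm):
    """Prefer the command line; fall back to the bracketed task name."""
    if raw_cmdline:
        args = raw_cmdline.rstrip(b"\0").replace(b"\0", b" ")
        return args.decode("utf-8", errors="replace")
    return "[" + raw_comm.decode("utf-8", errors="replace").strip() + "]"


def process_title(pid, timeout=0.5):
    """Title of a process, without waiting on cmdline for more than `timeout` seconds."""
    raw = read_with_timeout(lambda: read_file(f"/proc/{pid}/cmdline"), timeout)
    return title_from(raw, read_file(f"/proc/{pid}/comm"))
```

The bracketed form mirrors what ps prints for processes with an empty cmdline, so the output stays familiar when the fallback kicks in.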
