Tuesday, April 7, 2015

nofile, ulimit -n, RLIMIT_NOFILE -- the most misunderstood resource limit

Have you ever seen "VFS: file-max limit XXX reached" and proceeded to hunt for file descriptor consumers? Does this prompt you to run lsof -u joe | wc -l to find out how many file descriptors joe uses? If so, this post is for you.

The aforementioned message is not about file descriptors, and lsof -u joe shows way more than just file descriptors anyway.
So what is limited by RLIMIT_NOFILE?

The biggest number that can be assigned to a file descriptor in a given process. Strictly speaking, the limit is one greater than that number: with a limit of N, the highest descriptor that can be handed out is N-1.

I repeat: it limits the biggest number that can be assigned to a file descriptor, and of course only from the moment the limit is applied. As a side effect, this limits the number of file descriptors a process can open from this point on.

1. A process sets its own RLIMIT_NOFILE to 20. How many file descriptors can it have open?

Impossible to tell. No new file descriptor will be numbered 20 or higher, but there may be a huge number of already open file descriptors with higher numbers.
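A minimal sketch of this answer, assuming Linux and a starting soft limit above 100 (the function name and the choice of 100 are made up for illustration). It parks a descriptor at a high number, lowers the soft limit to 20, and then checks what still works. The old soft limit is restored before returning.

```python
import errno
import fcntl
import os
import resource


def lower_limit_and_probe(new_soft=20, high_min=100):
    """Open an fd numbered >= high_min, lower the soft limit, then probe."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    r, w = os.pipe()
    # Lowest free descriptor numbered >= high_min, duplicated from r.
    # Assumption: the current soft limit is larger than high_min.
    high_fd = fcntl.fcntl(r, fcntl.F_DUPFD, high_min)
    new_fds = []
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        try:
            os.fstat(high_fd)  # would raise OSError if the fd were invalid
            high_fd_usable = True
            failure = None
            try:
                while True:  # consume every number that is still allowed
                    new_fds.append(os.dup(r))
            except OSError as exc:
                failure = exc.errno
        finally:
            # Raising the soft limit back to its old value is permitted
            # because it does not exceed the hard limit.
            resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
    finally:
        for fd in new_fds + [high_fd, r, w]:
            os.close(fd)
    return high_fd, high_fd_usable, new_fds, failure
```

The descriptor created before the limit was lowered stays usable, every descriptor handed out afterwards is numbered below the new limit, and the attempt that finds no free number fails with EMFILE.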

2. There is only one process owned by joe. It has the following file descriptors open: 0, 1, 2. It sets its own RLIMIT_NOFILE to 20 and creates a new process. How many file descriptors can be opened in each of them?

The nofile limit is per process, so the fact that one of these processes created the other is irrelevant; the child merely inherits the limit value and a copy of the descriptors. Each of them can open 17 more file descriptors, numbered 3 through 19.

You may have encountered the following:
VFS: file-max limit $BIGNUM reached

What's the relationship between file descriptors and 'file' (struct file) limit?

So what is struct file? It is an object which contains some state related to an open entity like an on-disk file, pipe etc.

A struct file may be used internally by the kernel without being associated with any file descriptor.
Each file descriptor is associated with exactly one struct file.
Each struct file can have any number of file descriptors associated with it.
On clone() without CLONE_FILES (which includes fork()) file descriptors are copied, i.e. they reference the same 'struct file' as their counterparts in the parent process.

Opening a file typically boils down to the following:
lookup the file
allocate new file descriptor
allocate struct file
tie up an inode with struct file
set file descriptor to 'point' to struct file

3. Process has the following file descriptors open: 0, 1, 2. Now it calls clone(). How many new 'struct file' are allocated in order to satisfy this request?

None. Descriptors 0, 1, 2 in the new process use the same struct file objects as 0, 1, 2 in the parent.
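One observable consequence of sharing a struct file is a shared file offset. The sketch below assumes Linux and uses os.fork() as a stand-in for the clone() call in the question; the function name and the temporary file are illustrative only. The child moves the offset through its copy of the descriptor and the parent then reads the offset back.

```python
import os
import tempfile


def offset_seen_by_parent_after_child_seek(seek_to=5):
    """Return the parent's file offset after the child has called lseek()."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)
        pid = os.fork()
        if pid == 0:
            # Child: move the offset through its own copy of the descriptor,
            # then exit without running the parent's cleanup code.
            try:
                os.lseek(fd, seek_to, os.SEEK_SET)
            finally:
                os._exit(0)
        os.waitpid(pid, 0)
        # Parent: query the current offset without moving it.
        return os.lseek(fd, 0, os.SEEK_CUR)
    finally:
        os.close(fd)
        os.unlink(path)
```

The parent never seeks past 0 itself, yet it observes the offset the child set, because both descriptors point at one struct file.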

4. A process has the following file descriptors open: 0, 1, 2. Now it exits. How many 'struct file' will be freed as a result?

Impossible to tell. First, it is possible that all file descriptors were associated with the same 'struct file'. Not only that, these file descriptors could be inherited from parent process which is still alive and didn't modify its descriptors. As such, it is possible that struct file(s) in question are still in use.

5. No process has /etc/passwd open. Now one process opens it 3 times. How many 'struct file' were allocated as a result?

Three, one for each open request.
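The difference between a separate open and a duplicated descriptor can be observed from userspace through the file offset. This is a small sketch using a temporary file instead of /etc/passwd; the function name is made up for illustration.

```python
import os
import tempfile


def compare_open_and_dup():
    """Seek on one descriptor and report the offsets seen through two others."""
    fd0, path = tempfile.mkstemp()
    fds = [fd0]
    try:
        os.write(fd0, b"0123456789")
        a = os.open(path, os.O_RDONLY)  # first open request
        fds.append(a)
        b = os.open(path, os.O_RDONLY)  # second, independent open request
        fds.append(b)
        c = os.dup(a)                   # new descriptor, no new open request
        fds.append(c)
        os.lseek(a, 7, os.SEEK_SET)
        return {
            "second_open": os.lseek(b, 0, os.SEEK_CUR),
            "dup": os.lseek(c, 0, os.SEEK_CUR),
        }
    finally:
        for fd in fds:
            os.close(fd)
        os.unlink(path)
```

The descriptor from the second open keeps its own offset, while the duplicated descriptor follows the seek, which matches one struct file per open request and none for dup().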

With that established let's take a look at related errors (man errno):

ENFILE          Too many open files in system (POSIX.1)
EMFILE          Too many open files (POSIX.1)

The first one signals that the kernel ran out of 'struct file'; the other one signals that the given process cannot have more file descriptors.

When the kernel prints "VFS: file-max limit XXX reached" it says it won't allocate any new struct file.
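On Linux the system-wide counters behind this message are exposed in /proc/sys/fs/file-nr as three numbers: allocated struct files, allocated but unused ones, and the maximum (file-max). A sketch of reading them follows; the class and function names are hypothetical, and the file may be unavailable in restricted environments.

```python
from typing import NamedTuple


class FileNr(NamedTuple):
    allocated: int  # struct files currently allocated
    unused: int     # allocated but unused (0 on modern kernels)
    maximum: int    # system-wide limit, i.e. file-max


def parse_file_nr(text):
    """Parse the three whitespace-separated fields of file-nr."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return FileNr(allocated, unused, maximum)


def read_file_nr(path="/proc/sys/fs/file-nr"):
    """Read and parse the kernel's struct file counters."""
    with open(path) as f:
        return parse_file_nr(f.read())
```

Comparing the allocated field with maximum says how close the whole system is to ENFILE; it says nothing about any single process's descriptor numbers.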

6. Let's assume the kernel reached the limit of 'struct file'. Now joe's process tried to obtain a new file descriptor. Can this operation succeed? Which error is returned on failure?

If the new file descriptor needs a new 'struct file', the error will be ENFILE.
But it may be that this file descriptor is going to reuse an already existing 'struct file' (as with dup()), in which case it does not matter that the kernel hit the limit. It can fail with EMFILE, or it can succeed, depending on rlimits.

lsof | wc -l vs number of open descriptors

Apart from file descriptors, lsof shows other stuff (e.g. in-memory file mappings, current working directory). As such, output from mere 'lsof' invocations cannot be used to check file descriptors.

7. An administrator does `lsof -p 8888 | wc -l` and receives 10000. How many file descriptors are in use by this process?

As noted earlier, impossible to tell due to other fields printed by lsof.

The current number of file descriptors open in a given process can be obtained by counting the symlinks in /proc/<pid>/fd.
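The same count can be taken programmatically. The sketch below assumes a Linux system with /proc mounted; the function name is made up for illustration.

```python
import os


def open_fds(pid="self"):
    """Return the descriptor numbers listed in /proc/<pid>/fd, sorted."""
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/fd"))
```

len(open_fds(pid)) gives the count and max(open_fds(pid)) the highest number in use. When a process inspects itself, the listing may include the descriptor that was briefly opened to read the directory.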

8. `ls /proc/<pid>/fd | wc -l`  returns 9000. How many 'struct file' are in use by this process?

Anything between 1 and 9000 (including both).

9. We get the same result as in the previous question. What can you say about the 'nofile' rlimit set on this process?

Nothing. Not only do we not know the biggest open fd; even if we did, that fd could have been opened before the limit was applied.
