The passed address may belong to kernelspace, so it has to be validated. But let's say we already did that.
Consider a toy syscall:
```
 1  int
 2  sys_meh(const char *name, int value)
 3  {
 4          if (!is_root()) {
 5                  if (strcmp(name, "special") == 0)
 6                          return -EPERM;
 7          }
 8
 9          spin_lock(&meh_lock);
10          meh_modify(name, value);
11          spin_unlock(&meh_lock);
12          return 0;
13  }
```
Here we accept a name and a value, but only root is allowed to modify the object identified as special.
The first access is the strcmp call at line 5. What if the passed address is garbage? The read will trigger a page fault, and with no way to communicate the problem to strcmp, the kernel is forced to oops/panic.
So let's say the address is not garbage.
The name is read twice: by sys_meh itself and later by meh_modify. In other words, the code relies on the value not changing between the two reads. Is the expectation met? No. For instance, a second thread can modify the string after strcmp is done but before meh_modify is called. This in effect circumvents the protection we had in place.
In sys_meh the situation is even worse. By the time the code reaches meh_modify, the kernel could have decided to evict the page backing the string. On access a page fault will occur and the kernel will try to bring the page back in. But the code holds a spinlock at that point, and servicing a page fault while holding a spinlock is illegal due to deadlock potential.
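The window between the two reads can be sketched in plain userspace C. This is only an illustration under stated assumptions, not kernel code: user_name, modified_name, racing_thread, flip_to_special and sys_meh_racy are hypothetical names invented here, the caller is assumed to be non-root, the locking is left out, and the "second thread" is simulated by a hook called between the check and the use so that the outcome is deterministic.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names come from a real kernel. */
static char user_name[16];      /* memory the "user program" controls */
static char modified_name[16];  /* records the name meh_modify() actually saw */

/* Plays the role of the second thread: called between check and use. */
static void (*racing_thread)(void);

static void
flip_to_special(void)
{
	strcpy(user_name, "special");
}

static void
meh_modify(const char *name, int value)
{
	(void)value;
	/* Second fetch: the user's string is read again. */
	strncpy(modified_name, name, sizeof(modified_name) - 1);
	modified_name[sizeof(modified_name) - 1] = '\0';
}

/* Same shape as sys_meh for a non-root caller, minus the locking. */
static int
sys_meh_racy(const char *name, int value)
{
	/* First fetch: the permission check. */
	if (strcmp(name, "special") == 0)
		return -EPERM;

	/* The window in which another thread can rewrite the string. */
	if (racing_thread != NULL)
		racing_thread();

	meh_modify(name, value);
	return 0;
}
```

With user_name set to "boring" and racing_thread set to flip_to_special, the call passes the check and still ends up modifying "special".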
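In situations like this the standard approach is to first copy the relevant data into a temporary kernel buffer and then work only on that copy. The copy is made before any lock is taken, and both the permission check and the modification see the same bytes.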
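A minimal userspace sketch of the copy-first approach follows. It rests on several assumptions: fake_copyinstr is a made-up stand-in for a real copy-in primitive (copyinstr(9) on the BSDs or strncpy_from_user() on Linux play this role), and unlike the real thing it does not validate the address or recover from faults, it only bounds the length. MEH_NAME_MAX, sys_meh_fixed, user_name, racing_thread and flip_to_special are likewise hypothetical, the caller is assumed to be non-root, and the concurrent writer is simulated by a hook.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MEH_NAME_MAX 32         /* hypothetical limit on the name length */

static char user_name[MEH_NAME_MAX];      /* memory the "user program" controls */
static char modified_name[MEH_NAME_MAX];  /* what meh_modify() actually saw */
static void (*racing_thread)(void);       /* simulated second thread */

static void
flip_to_special(void)
{
	strcpy(user_name, "special");
}

/*
 * Hypothetical copy-in helper. A real one also checks that uaddr is a
 * userspace address and returns an error instead of crashing on a fault.
 */
static int
fake_copyinstr(const char *uaddr, char *kbuf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		kbuf[i] = uaddr[i];
		if (kbuf[i] == '\0')
			return 0;
	}
	return -ENAMETOOLONG;   /* no terminator within len bytes */
}

static void
meh_modify(const char *name, int value)
{
	(void)value;
	strncpy(modified_name, name, sizeof(modified_name) - 1);
	modified_name[sizeof(modified_name) - 1] = '\0';
}

static int
sys_meh_fixed(const char *uname, int value)
{
	char name[MEH_NAME_MAX];
	int error;

	/* The only fetch from user memory, done before any lock is taken. */
	error = fake_copyinstr(uname, name, sizeof(name));
	if (error != 0)
		return error;

	if (strcmp(name, "special") == 0)
		return -EPERM;

	/* A concurrent writer can still scribble on uname, but not on name. */
	if (racing_thread != NULL)
		racing_thread();

	/* The spinlock would be taken here; only the private copy is used below. */
	meh_modify(name, value);
	return 0;
}
```

Here the racing writer still changes user_name to "special", but meh_modify only ever sees the private copy that was checked, so the name it modifies is the one that passed the check.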
Reading user memory more than once caused serious trouble when various security-oriented syscall wrappers were implemented. For instance, code trying to restrict file access by monitoring filenames had the exact same bug as sys_meh above (though it could also be circumvented in a myriad of other ways, including symlinks). Interested parties are invited to read Exploiting Concurrency Vulnerabilities in System Call Wrappers.