Let's reiterate. Consider:
int val;
val = *some_userspace_pointer;
printf("%d\n", val);
If some_userspace_pointer contains garbage, for example, a page fault is going to occur. The page fault handler will conclude the fault cannot be satisfied. But there is no way to tell this code about the issue: it only reads the value and assumes the read succeeded.
What's needed is a function which will be able to actually detect the condition and return an error to the caller. With such a primitive in place the code becomes:
int val, error;
error = copyin(some_userspace_pointer, &val, sizeof(val));
if (error != 0)
	return error;
printf("%d\n", val);
A super slow variant would lock the address space, ensure the relevant mappings are fine and only then do the read. That's a lot of work which is completely unnecessary in the common case.
Instead, the standard approach is to have a way to tell the page fault handler where to jump if the page fault cannot be serviced. The code at that place is supposed to clean up after the failed copy and return to the original caller.
In pseudo-code it would look like this:
int
copyin(void *from, void *to, size_t len)
{
	set_fault_handler(copyin_fault);
	if (len == 0)
		goto done_copyin;
	if (!fits_userspace(from, len))
		goto copyin_fault;
	memcpy(to, from, len);
done_copyin:
	set_fault_handler(0);
	return 0;
copyin_fault:
	set_fault_handler(0);
	return EFAULT;
}
Let's take a look at an actual implementation with straightforward assembly (copyin(9) from the FreeBSD tree):
/*
 * copyin(from_user, to_kernel, len) - MP SAFE
 *        %rdi,      %rsi,      %rdx
 */
ENTRY(copyin)
	PUSH_FRAME_POINTER
	movq	PCPU(CURPCB),%rax
	movq	$copyin_fault,PCB_ONFAULT(%rax)
The handler is first set...
	testq	%rdx,%rdx			/* anything to do? */
	jz	done_copyin

	/*
	 * make sure address is valid
	 */
	movq	%rdi,%rax
	addq	%rdx,%rax
	jc	copyin_fault
	movq	$VM_MAXUSER_ADDRESS,%rcx
	cmpq	%rcx,%rax
	ja	copyin_fault
... the range is then validated ...
... and finally the copy is actually done. In the event of a page fault which cannot be satisfied, the kernel will jump to the copyin_fault label, which unsets the handler and returns an error, effectively cleaning up after the function. The target buffer may now contain partially copied data, but that's an acceptable state: if the syscall failed, the buffer content is not specified. Finally, if a page fault could be serviced without an issue (e.g. a page was swapped in) or there were no page faults, copying finishes and the code falls through to done_copyin to unset the handler and return 0.
	xchgq	%rdi,%rsi
	movq	%rdx,%rcx
	movb	%cl,%al
	shrq	$3,%rcx				/* copy longword-wise */
	cld
	rep
	movsq
	movb	%al,%cl
	andb	$7,%cl				/* copy remaining bytes */
	rep
	movsb

done_copyin:
	xorl	%eax,%eax
	movq	PCPU(CURPCB),%rdx
	movq	%rax,PCB_ONFAULT(%rdx)
	POP_FRAME_POINTER
	ret

	ALIGN_TEXT
copyin_fault:
	movq	PCPU(CURPCB),%rdx
	movq	$0,PCB_ONFAULT(%rdx)
	movq	$EFAULT,%rax
	POP_FRAME_POINTER
	ret
END(copyin)
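For readers less comfortable with assembly, the two less obvious pieces of the routine (the range check and the split into 8-byte and 1-byte copies) can be approximated in C. This is a sketch only: the function names and SKETCH_MAXUSER_ADDRESS are made up for illustration (the real VM_MAXUSER_ADDRESS is platform-specific), and fault handling is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder value, NOT the real VM_MAXUSER_ADDRESS. */
#define	SKETCH_MAXUSER_ADDRESS	UINT64_C(0x00007ffffffff000)

/*
 * Mirrors addq/jc followed by cmpq/ja: reject a range whose end wraps
 * around, then reject a range which ends above the userspace limit.
 * Returns 1 if the range is acceptable and 0 otherwise.
 */
int
range_fits_userspace(uint64_t from, uint64_t len)
{
	uint64_t end = from + len;

	if (end < from)			/* jc: the addition carried */
		return 0;
	if (end > SKETCH_MAXUSER_ADDRESS)	/* ja: unsigned "above" */
		return 0;
	return 1;
}

/*
 * Mirrors shrq $3 + rep movsq, then andb $7 + rep movsb: copy as many
 * 8-byte units as fit, then the remaining 0-7 bytes one at a time.
 */
void
copy_quads_then_bytes(void *to, const void *from, size_t len)
{
	unsigned char *dst = to;
	const unsigned char *src = from;
	size_t quads = len >> 3;	/* number of 8-byte units */
	size_t tail = len & 7;		/* leftover bytes */

	while (quads-- > 0) {
		memcpy(dst, src, 8);
		dst += 8;
		src += 8;
	}
	while (tail-- > 0)
		*dst++ = *src++;
}
```

Note the carry check has to come first: with a huge len the sum can wrap around to a small number which would otherwise pass the comparison against the limit.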