On the kernel side, yes.  In the userspace trampoline, it's not required.
> but once we start using sysexit for the signal handler return path too, we
> will need to restore %edx and %ecx too, otherwise our restarted system
> call will have crap in the registers. I already wrote the code, but
> decided that as long as we don't do that kind of restarting, we shouldn't
> have the overhead in the trampoline. But basically the trampoline then
> will become
> 
> 	system_call_trampoline:
> 		pushfl
> 		pushl %ecx
> 		pushl %edx
> 		pushl %ebp
> 		movl %esp,%ebp
> 		syscall
> 	0:
> 		movl %esp,%ebp
> 		movl 4(%ebp),%edx
> 		movl 8(%ebp),%ecx
> 		syscall
> 
> 	restart:
> 		jmp 0b
> 	sysenter_return_point:
> 		popl %ebp
> 		popl %edx
> 		popl %ecx
> 		popfl
> 		ret
> see? So you _have_ to really save the arguments anyway, because you cannot
> do a sysexit-based system call restart otherwise. And once you save them,
> you might as well restore them too.
> 
> And since you have to restore them for system call restart anyway, you
> might as well just make it part of the calling convention.
>
> Yes, I'm thinking ahead. Sue me.
You're optimising the _rare_ case.
The correct [;)] trampoline looks like this:
 	system_call_trampoline:
 		pushl %ebp
		movl %esp,%ebp
 		sysenter
 	sysenter_return_point:
		popl %ebp
 		ret
 	sysenter_restart:
		popl %edx
		popl %ecx
		movl %esp,%ebp
 		sysenter
This is accompanied by changing this line in arch/i386/kernel/signal.c:
	regs->eip -= 2;
To this (best moved to an inline function):
	if (likely (regs->eip == sysenter_return_point)) {
		unsigned long * esp = (unsigned long *) regs->esp - 8;
		if (!access_ok(VERIFY_WRITE, esp, 8)
		    || __put_user(regs->edx, esp)
		    || __put_user(regs->ecx, esp+4)) {
			if (sig == SIGSEGV)
				ka->sa.sa_handler = SIG_DFL;
			force_sig(SIGSEGV, current);
		}
		regs->esp = (long) esp;
		regs->eip = sysenter_restart;
	} else {
		regs->eip -= 2;
	}
Thus the common case, system calls, are optimised.  The uncommon case,
signal interrupts system call, is penalised, though it's not a large
penalty.  (Much less than the gain from using sysexit!)  Which is more
common?
By the way, this works with the AMD syscall instruction too.
Then the trampoline is:
	system_call_trampoline:
		pushl %ebp
		movl %ecx,%ebp
		syscall
	syscall_return_point:
		popl %ebp
		ret
	syscall_restart:
		popl %edx
		popl %ebp
		syscall
Finally, there is no need to restore %ebp in the kernel sysexit/sysret
paths, because it will always be restored in the trampoline.  So you
save 1 memory access there too.
(ps. I have a question: why does your trampoline save & restore
the flags?)
-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/