force_successful_syscall_return() buggy?

Russell King (rmk@arm.linux.org.uk)
Sun, 15 Jun 2003 19:36:04 +0100


While looking at the new bits'n'pieces which have appeared in 2.5.71, I
noticed the following in alpha and ia64:

#define alpha_task_regs(task) \
	((struct pt_regs *) ((long) (task)->thread_info + 2*PAGE_SIZE) - 1)

#define force_successful_syscall_return() (alpha_task_regs(current)->r0 = 0)

# define ia64_task_regs(t) (((struct pt_regs *) ((char *) (t) + IA64_STK_OFFSET)) - 1)

#define force_successful_syscall_return()		\
	do {						\
		ia64_task_regs(current)->r8 = 0;	\
	} while (0)
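
Both *_task_regs() macros have the same shape: step to the end of the
kernel stack area, then back by exactly one struct pt_regs. The result is
a fixed address that does not depend on how deep the stack actually was
when the registers were saved. A minimal user-space sketch of that
arithmetic follows. TOY_STACK_SIZE, struct toy_pt_regs and both function
names are made up for illustration and do not match either architecture's
real layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real values are architecture specific. */
#define TOY_STACK_SIZE 8192UL

struct toy_pt_regs {
	long r0;	/* the slot the macro clears in this toy model */
	long pad[7];	/* placeholder for the remaining saved registers */
};

/*
 * Same shape as alpha_task_regs()/ia64_task_regs(): go to the end of
 * the stack area, then step back by one struct toy_pt_regs.
 */
static struct toy_pt_regs *toy_task_regs(void *stack_base)
{
	assert(stack_base != NULL);
	return (struct toy_pt_regs *)((char *)stack_base + TOY_STACK_SIZE) - 1;
}

/* Byte offset of the computed pointer from the base of the stack area. */
static size_t toy_task_regs_offset(void)
{
	static long stack[TOY_STACK_SIZE / sizeof(long)];

	return (size_t)((char *)toy_task_regs(stack) - (char *)stack);
}
```

In this model the offset is always TOY_STACK_SIZE minus
sizeof(struct toy_pt_regs), whatever else happens to be on the stack.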

I don't know what happens on these architectures, but I have a suspicion
that there is a case in which the above will fail, maybe with dramatic
consequences.

Consider what happens when a userspace program is started from kernel
space, e.g. the init(8) or hotplug programs. In these cases, we call
execve() from within a kernel-space function. This implies that we
already have some frames on the kernel stack.

AFAIK, sys_execve() does not ensure that the kernel stack will be empty
before starting the user space thread, so these programs are running with
a slightly reduced kernel stack.

In turn, this means that the user registers are not stored at the top
of the kernel stack when the user space program subsequently calls a
kernel system call, which means the *_task_regs() macro doesn't point
at the saved user registers.
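
To illustrate the failure mode I suspect, here is a user-space simulation
under stated assumptions: the stack grows down from the top of a fixed
area, syscall entry pushes the user registers directly below whatever is
already on the stack, and `leftover` stands for the bytes of old kernel
frames left behind by an in-kernel execve(). SIM_STACK_SIZE,
struct sim_pt_regs and the sim_* functions are hypothetical names; this
is a sketch of the argument, not the real entry code of either
architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SIM_STACK_SIZE 8192UL	/* hypothetical kernel stack size */

struct sim_pt_regs {
	long r0;	/* the slot force_successful_syscall_return() clears */
	long pad[7];	/* placeholder for the remaining saved registers */
};

/* Top-of-stack guess, same shape as the *_task_regs() macros. */
static struct sim_pt_regs *sim_task_regs(char *base)
{
	return (struct sim_pt_regs *)(base + SIM_STACK_SIZE) - 1;
}

/*
 * Simulate one syscall with `leftover` bytes of stale frames already on
 * the stack. Returns 1 if clearing r0 through the top-of-stack guess
 * reached the registers that were really saved, 0 if it landed elsewhere.
 */
static int sim_force_reaches_saved_regs(size_t leftover)
{
	char *stack = calloc(1, SIM_STACK_SIZE);
	struct sim_pt_regs *saved, *guessed;
	int reached;

	assert(stack != NULL);
	assert(leftover % sizeof(long) == 0);	/* keep the frame aligned */
	assert(leftover + sizeof(struct sim_pt_regs) <= SIM_STACK_SIZE);

	/* Entry: registers are pushed below the frames already present. */
	saved = (struct sim_pt_regs *)(stack + SIM_STACK_SIZE - leftover) - 1;
	saved->r0 = 42;		/* any non-zero marker */

	/* The macro writes through the fixed top-of-stack address. */
	guessed = sim_task_regs(stack);
	guessed->r0 = 0;

	reached = (saved->r0 == 0);
	free(stack);
	return reached;
}
```

With leftover == 0 the two pointers coincide and the write has the
intended effect. With any non-zero leftover, the write goes into the
stale frames instead and the really saved r0 is left untouched.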

-- 
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html
