ptrace patch fails stress testing

linas@austin.ibm.com
Tue, 1 Apr 2003 12:22:56 -0600


Hi,

I've got a number of machines here that crash after installing
the recent ptrace fix. The crash only occurs when the machines
are highly stressed.

The problem appears to be that task->mm is dereferenced without
first checking whether mm is NULL. For example, the is_dumpable()
macro in sched.h evaluates task->mm->dumpable. I'm sitting in
front of a KDB session, and the task->mm I'm looking at is
clearly NULL.
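
To make the failure concrete, here is a minimal user-space sketch of the
unguarded and guarded forms of such a macro. The struct and macro names
(mm_sketch, task_sketch, is_dumpable_unguarded, is_dumpable_guarded) are
hypothetical stand-ins, not the kernel's definitions, and treating a task
with no mm as "not dumpable" is an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct mm_sketch {
    int dumpable;
};

struct task_sketch {
    struct mm_sketch *mm;   /* may be NULL, as seen in the KDB session */
};

/* Unguarded form: dereferences mm unconditionally, so it faults on NULL. */
#define is_dumpable_unguarded(tsk) ((tsk)->mm->dumpable)

/* Guarded form: a task without an mm is treated as not dumpable (assumption). */
#define is_dumpable_guarded(tsk) ((tsk)->mm != NULL && (tsk)->mm->dumpable)
```

Note that the guarded macro still reads mm twice, so on its own it would not
close a race in which mm becomes NULL between the check and the use.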

In my particular case, the crash is *always* in kernel/ptrace.c
in access_process_vm() (which is called when something tries
to read /proc/pid/cmdline). There seem to be a few other places
in the kernel where task->mm is dereferenced without checking mm,
but these are rare (?). Most (?) places seem to make a point of
checking for NULL before using mm.

Why, how and under what conditions this race condition occurs,
I don't know. What the best fix is, I don't know.

I can try to just add a check for NULL, but I'd like someone
to tell me that 'yes this is the right way to fix this.'
(As opposed to trying to take some lock, or trying to force
the process to get paged in, or whatever.)
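
For discussion, the "just add a check for NULL" option might look roughly
like the user-space sketch below. Everything in it is hypothetical
(mm_sketch, task_sketch, read_args_sketch); it is not access_process_vm()
and it models no kernel locking. The one point it illustrates is reading
the mm pointer once into a local, then testing and using only that local.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the fields are invented for illustration. */
struct mm_sketch {
    const char *args;       /* pretend command-line bytes */
    size_t      args_len;
};

struct task_sketch {
    struct mm_sketch *mm;   /* may be NULL */
};

/*
 * Copy up to len bytes of the task's argument area into buf.
 * Returns the number of bytes copied, or 0 if the task has no mm.
 */
size_t read_args_sketch(const struct task_sketch *tsk, char *buf, size_t len)
{
    struct mm_sketch *mm = tsk->mm;   /* read the pointer exactly once */

    if (mm == NULL)
        return 0;                     /* nothing to read; do not dereference */

    if (len > mm->args_len)
        len = mm->args_len;
    memcpy(buf, mm->args, len);
    return len;
}
```

In a real kernel the snapshot alone would presumably not be enough: the
pointer would have to be read under whatever lock protects task->mm, and
the mm kept alive while it is in use. Neither is modeled here.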

BTW, this is an SMP machine, don't know if that matters.

Comments? Suggestions?

--linas
