non-atomic test victim #1

Robert Love (rml@tech9.net)
17 Sep 2002 16:33:33 -0400


This is the current kernel with the non-atomic test I sent, built for SMP
with preemption enabled.

I only get two triggers, both during boot. They occur at migration_thread
and ksoftirqd startup. The problem lies in set_cpus_allowed():

Trace; c0116ac4 <schedule+a4/4b0>
Trace; c011731a <wait_for_completion+11a/1d0>
Trace; c0116f30 <default_wake_function+0/40>
Trace; c0115d02 <try_to_wake_up+312/320>
Trace; c0116f30 <default_wake_function+0/40>
Trace; c0118f0f <set_cpus_allowed+22f/250>
Trace; c0118f7d <migration_thread+4d/590>
Trace; c0118f30 <migration_thread+0/590>
Trace; c0118f30 <migration_thread+0/590>
Trace; c010586d <kernel_thread_helper+5/18>

It is evidently the preempt_disable() in set_cpus_allowed(), which we
still hold past the wake_up() and into wait_for_completion(), where we
then call schedule().
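
To make the shape of the problem concrete, here is a minimal userspace
sketch. It is not the kernel code: every sim_* name is a hypothetical
stand-in invented for illustration, and the body of the real
set_cpus_allowed() is only assumed to follow the order the trace suggests
(disable preemption, wake up, wait for completion). The "released" variant
is there for comparison only and is not a proposed fix.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static int sim_preempt_count;      /* nesting depth of disabled preemption */
static int sim_atomic_violations;  /* times we "slept" while non-preemptible */

static void sim_preempt_disable(void)
{
    sim_preempt_count++;
}

static void sim_preempt_enable(void)
{
    assert(sim_preempt_count > 0);  /* an unbalanced enable is a bug */
    sim_preempt_count--;
}

/* Stand-in for schedule(): record a trigger if preemption is disabled. */
static void sim_schedule(const char *caller)
{
    if (sim_preempt_count != 0) {
        sim_atomic_violations++;
        fprintf(stderr, "scheduling with preemption disabled in %s\n", caller);
    }
}

/* Stand-in for wait_for_completion(): it may sleep, so it reaches schedule. */
static void sim_wait_for_completion(const char *caller)
{
    sim_schedule(caller);
}

/* Stand-in for the wakeup: it never sleeps, so there is nothing to check. */
static void sim_wake_up(void)
{
}

/* Assumed shape: the disabled region extends past the wakeup into the wait. */
static void sim_set_cpus_allowed_held(void)
{
    sim_preempt_disable();
    sim_wake_up();
    sim_wait_for_completion(__func__);
    sim_preempt_enable();
}

/* Comparison only: preemption is re-enabled before the sleeping wait. */
static void sim_set_cpus_allowed_released(void)
{
    sim_preempt_disable();
    sim_wake_up();
    sim_preempt_enable();
    sim_wait_for_completion(__func__);
}
```

Calling sim_set_cpus_allowed_held() bumps sim_atomic_violations once,
while sim_set_cpus_allowed_released() leaves it alone. The sketch says
nothing about why the real function needs the preempt_disable().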

The catch is that without this preempt_disable(), crashes have been
observed, especially on large n-way machines. Both Andrew Morton and
Anton Blanchard have reported the problem, and that adding the
preempt_disable() fixes it.

The question is, why does set_cpus_allowed() need it? I do not see why.
Could it be an issue with an early preemption and the resulting
migration_thread?

Ingo, ideas?

Robert Love
