After much diagnosis, I have found the following sequence of events causes the
oops:
	- do_tty_hangup called for the TTY
		- ldisc.open() called for the TTY
		- ldisc.open blocks (probably waiting for memory)
	- tty_release called for the same TTY
		- release_dev() called
		- release_dev() completes
	- tty_release completes
I know that ldisc.open is blocking because I never see the message that
indicates it has completed.
Then the oops occurs:
	>>EIP; 205d3335 Before first symbol   <=====
	Trace; c01a6e20 <reset_buffer_flags+5c/64>
	Trace; c01a8632 <n_tty_open+82/b0>
	Trace; c01a4826 <do_tty_hangup+18a/294>
	Trace; c02161a0 <NR_TYPES+7c0/1620>
	Trace; c01131fe <tq_sched_kthread+52/70>
	Trace; c01fcf2a <tvecs+2d6/41ec>
	Trace; c011327e <tq_sched_kthread_start+62/6c>
	Trace; c01fcf40 <tvecs+2ec/41ec>
	Trace; c0106000 <get_options+0/78>
	Trace; c0108853 <kernel_thread+23/38>
Note that the tq_scheduler list is being run out of a kernel thread here instead
of being executed directly by schedule(); this was necessary to eliminate a
kernel panic.
Given this, how should this race condition be avoided?  It seems that the TTY
structure has a life outside its use in the file structure since hangups can
occur on the TTY at any time.  Would it make sense to break the strong
association between the TTY structures and the file handling?  Then, the file
interface could be a client of the TTY service, as would the driver when it
wants to propagate asynchronous events.
As such, each would then do its own "get" and "put" operations on the TTY
structures.
Also, is there a short-term solution that would prevent the problem?
-art
P.S. Please CC: me on replies.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/