Re: [BUG][2.5.66bk9+] - tty hangings - patches, dmesg & sysctl+T

Andrew Morton (akpm@digeo.com)
Tue, 8 Apr 2003 21:56:51 -0700


Roland Dreier <roland@topspin.com> wrote:
>
> Still, I like the idea of this patch, since it resolves the livelock.
> But I don't think the implementation is quite right. insert_sequence
> doesn't get incremented until delayed_work_timer_fn(). That means
> that a driver (tty_io.c, for example) could call
> schedule_delayed_work(), then call flush_scheduled_work() before
> delayed_work_timer_fn() has run for that work.

The driver needs to run cancel_delayed_work() before calling
flush_scheduled_work(). The tty patch is already doing that, and I think
that plugs the holes.

Here's a full changelog.

The workqueue code currently has a notion of a per-cpu queue being "busy".
flush_scheduled_work()'s responsibility is to wait for a queue to be not busy.

Problem is, flush_scheduled_work() can easily hang up.

- The workqueue is deemed "busy" when there are pending (timer-based)
works. But if someone repeatedly schedules new work in the delayed
callback, the queue will never fall idle, and flush_scheduled_work() will
not complete.

- If someone reschedules work (not delayed work) in the work function, that
too will cause the queue to never go idle, and flush_scheduled_work() will
not terminate.

So what this patch does is:

- Create a new "cancel_delayed_work()" which will try to kill off any
timer-based works.

- Change flush_scheduled_work() so that it is immune to people re-adding
work in the work callout handler.

We can do this by recognising that the caller does *not* want to wait
until the workqueue is "empty". The caller merely wants to wait until all
works which were pending at the time flush_scheduled_work() was called have
completed.

The patch uses a couple of sequence numbers for that.

So now, if someone wants to reliably remove delayed work they should do:

/*
* Make sure that my work-callback will no longer schedule new work
*/
my_driver_is_shutting_down = 1;

/*
* Kill off any pending delayed work
*/
cancel_delayed_work(&my_work);

/*
* OK, there will be no new works scheduled. But there may be one
* currently queued or in progress. So wait for that to complete.
*/
flush_scheduled_work();

And change the flush_workqueue() sleep to be uninterruptible. We worry that
delivery of a signal may cause the wait to return too early.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/