Re: highmem deadlock fix [was Re: VM in 2.4.10(+tweaks) vs. 2.4.9-ac14/15(+stuff)]

Andrea Arcangeli (andrea@suse.de)
Fri, 28 Sep 2001 01:47:20 +0200


On Thu, Sep 27, 2001 at 04:16:11PM -0700, Linus Torvalds wrote:
>
> On Fri, 28 Sep 2001, Andrea Arcangeli wrote:
> However, your patch is racy:
>
> > --- 2.4.10aa2/fs/buffer.c.~1~ Wed Sep 26 18:45:29 2001
> > +++ 2.4.10aa2/fs/buffer.c Fri Sep 28 00:04:44 2001
> > @@ -194,6 +194,7 @@
> > struct buffer_head * bh = *array++;
> > bh->b_end_io = end_buffer_io_sync;
> > submit_bh(WRITE, bh);
> > + clear_bit(BH_Pending_IO, &bh->b_state);
>
> No way can we clear the bit here, because the submit_bh() may have caused
> the buffer to be unlocked and IO to have completed, and it is no longer
> "owned" by us - somebody else might have started IO on it and we'd be
> clearing the bit for the wrong user.

Moving clear_bit just above submit_bh will fix it (Robert, please make
this change before testing it). If we then block in submit_bh in the
bounce, we won't deadlock on ourselves because of the PageHighMem
check, and all the previous bhs, which are no longer marked pending, are
ok too. Only the next ones are problematic, and they're still marked
BH_Pending_IO, so we can't deadlock on them.
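
To make the ordering argument concrete, here is a small userspace model
of the two orderings. It is only a sketch: the bit helpers, the bit
number and submit_bh_model() are simplified stand-ins rather than the
real kernel code, and the model assumes the worst case Linus describes,
where by the time submit_bh returns the IO has completed and somebody
else has started IO on the same bh and marked it pending. The
PageHighMem check is not modelled.

```c
#include <assert.h>

/* Userspace model only. The bit number is hypothetical. */
enum { BH_Pending_IO = 0 };

struct buffer_head {
	unsigned long b_state;
};

/* Simplified, non-atomic stand-ins for the kernel bit operations. */
static void set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static void clear_bit(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

static int test_bit(int nr, const unsigned long *addr)
{
	return (int)((*addr >> nr) & 1UL);
}

/*
 * Worst-case model of submit_bh(): our IO completes and a second user
 * takes the bh and marks it pending before we get control back.
 */
static void submit_bh_model(struct buffer_head *bh)
{
	set_bit(BH_Pending_IO, &bh->b_state);	/* done by the new owner */
}

/* Original patch: clear after submit. Returns the new owner's bit. */
static int clear_after_submit(void)
{
	struct buffer_head bh = { 0 };

	set_bit(BH_Pending_IO, &bh.b_state);	/* we mark it pending */
	submit_bh_model(&bh);
	clear_bit(BH_Pending_IO, &bh.b_state);	/* clobbers the new owner */
	return test_bit(BH_Pending_IO, &bh.b_state);
}

/* Proposed ordering: clear while we still own the bh, then submit. */
static int clear_before_submit(void)
{
	struct buffer_head bh = { 0 };

	set_bit(BH_Pending_IO, &bh.b_state);	/* we mark it pending */
	clear_bit(BH_Pending_IO, &bh.b_state);	/* still ours here */
	submit_bh_model(&bh);
	return test_bit(BH_Pending_IO, &bh.b_state);
}

/* Sanity check of the model itself. */
static void check_model(void)
{
	assert(clear_after_submit() == 0);	/* new owner's bit was lost */
	assert(clear_before_submit() == 1);	/* new owner's bit survives */
}
```

In this model, clearing after the submit wipes out the bit set by the
second user, while clearing before the submit only ever touches our own
mark.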

So you can re-consider my approach: the design of the fix was ok, it was
just a silly implementation error.

Andrea