Re: Kernel panics on raw I/O stress test

Andrea Arcangeli (andrea@suse.de)
Fri, 20 Apr 2001 16:11:58 +0200


On Fri, Apr 20, 2001 at 08:44:35PM +0900, Takanori Kawano wrote:
>
> > Could you try again with 2.4.4pre4 plus the below patch?
> >
> > ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre2/rawio-3
>
> I suppose that 2.4.4-pre4 + rawio-3 patch still has SMP-unsafe
> raw i/o code and can cause the same panic I reported.

I fixed that as well last week; 2.4.4-pre4 + rawio-3 should be SMP safe and
faster, and my patch is recommended for integration.

> I think the following scenario is possible if there are 3 or more CPUs.
>
> (1) CPU0 enter rw_raw_dev()
> (2) CPU0 execute alloc_kiovec(1, &iobuf) // drivers/char/raw.c line 309
> (3) CPU0 enter brw_kiovec(rw, 1, &iobuf,..) // drivers/char/raw.c line 362
> (4) CPU0 enter __wait_on_buffer()

With my patch applied the kernel doesn't execute wait_on_buffer from wait_kio
here; it first executes kiobuf_wait_for_io. That is also a performance
optimization, because kiobuf_wait_for_io sleeps only once and gets a single
wakeup once the whole kiobuf I/O has completed.
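For illustration only, a minimal userspace model of that single-sleep /
single-wakeup behaviour (pthreads; the names io_count, kio_done and
completion() are invented for this sketch and are not the kernel's symbols):
the submitter blocks once on a completion count covering the whole kiobuf,
and only the last completing sub-I/O issues a wakeup, instead of a
wait_on_buffer loop over every buffer head.

/*
 * Toy model: sleep once, get one wakeup when the whole "kiobuf" is done.
 * Build with: gcc -pthread wait_model.c
 */
#include <pthread.h>
#include <stdio.h>

#define NR_BUFFERS 8

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  kio_done = PTHREAD_COND_INITIALIZER;
static int io_count;                    /* sub-I/Os still in flight */

static void *completion(void *arg)      /* stands in for the per-buffer completion */
{
	pthread_mutex_lock(&lock);
	if (--io_count == 0)            /* only the last completion ...          */
		pthread_cond_signal(&kio_done); /* ... wakes the waiter, once     */
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t tid[NR_BUFFERS];
	int i;

	io_count = NR_BUFFERS;          /* set before "submitting" the I/O */
	for (i = 0; i < NR_BUFFERS; i++)
		pthread_create(&tid[i], NULL, completion, NULL);

	pthread_mutex_lock(&lock);      /* stands in for the whole-kiobuf wait */
	while (io_count > 0)            /* one sleep for the whole kiobuf,      */
		pthread_cond_wait(&kio_done, &lock); /* not one per buffer head */
	pthread_mutex_unlock(&lock);

	for (i = 0; i < NR_BUFFERS; i++)
		pthread_join(tid[i], NULL);
	printf("all %d sub-I/Os completed, safe to free the kiobuf\n", NR_BUFFERS);
	return 0;
}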

> (5) CPU0 execute run_task_queue() and wait
> while buffer_locked(bh) is true. // fs/buffer.c line 152-158
> (6) CPU1 enter end_buffer_io_kiobuf() with
> iobuf allocated at (2)
> (7) CPU1 execute unlock_buffer() // fs/buffer.c line 1994
> (8) CPU0 exit __wait_on_buffer()
> (9) CPU0 exit brw_kiovec(rw, 1, &iobuf,..)
> (10) CPU0 execute free_kiovec(1, &iobuf) // drivers/char/raw.c line 388
> (11) The task on CPU2 reused the area freed
> at (10).
> (12) CPU1 enter end_kio_request() and touch
> the corrupted iobuf, then panic.

With my patch applied, the end_kio_request on CPU1 is executed before CPU0 can
execute wait_kio, so it cannot race in the above way.
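Again as an illustrative userspace sketch (not the kernel code; iobuf_model
and end_kio_request_model are invented names): the point is that with the
fixed ordering the completion side's last access to the iobuf happens before
it issues the final wakeup, so once the waiter is released it can free the
iobuf without any completion path touching it afterwards. In the racy
ordering, the per-buffer unlock released the waiter while the completion
handler still had the iobuf in hand.

/*
 * Toy model of the free-safe ordering: the handler writes the iobuf,
 * then issues the final wakeup, and never touches the iobuf again.
 * Build with: gcc -pthread ordering_model.c
 */
#include <pthread.h>
#include <stdlib.h>

struct iobuf_model {
	int errno_val;                  /* field the completion handler writes */
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done = PTHREAD_COND_INITIALIZER;
static int io_count = 1;
static struct iobuf_model *iobuf;

static void *end_kio_request_model(void *arg)
{
	iobuf->errno_val = 0;           /* touch the iobuf *before* the wakeup */

	pthread_mutex_lock(&lock);
	if (--io_count == 0)
		pthread_cond_signal(&done);     /* final wakeup: last thing done */
	pthread_mutex_unlock(&lock);
	/* The racy variant instead released the waiter from the per-buffer
	 * unlock and dereferenced the iobuf afterwards, i.e. possibly after
	 * the waiter had already freed it. */
	return NULL;
}

int main(void)
{
	pthread_t tid;

	iobuf = calloc(1, sizeof(*iobuf));
	pthread_create(&tid, NULL, end_kio_request_model, NULL);

	pthread_mutex_lock(&lock);
	while (io_count > 0)            /* released only by the final wakeup */
		pthread_cond_wait(&done, &lock);
	pthread_mutex_unlock(&lock);

	free(iobuf);                    /* safe: nothing touches it after the wakeup */
	pthread_join(tid, NULL);
	return 0;
}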

Thanks for your comments (and yes, you are right that the above race can happen
in all 2.4 kernels out there except the latest -aa ones).

Andrea