Re: One more bio for for floppy users in 2.5.33..

Andrew Morton (akpm@zip.com.au)
Thu, 05 Sep 2002 09:26:28 -0700

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Chris Friesen: "Re: ARP and alias IPs"
Previous message: Oleg Drokin: "Re: [reiserfs-dev] Re: [PATCH] sparc32: wrong type of nlink_t"
Maybe in reply to: Linus Torvalds: "One more bio for for floppy users in 2.5.33.."
Next in thread: Andrew Morton: "Re: One more bio for for floppy users in 2.5.33.."

Linus Torvalds wrote:
>
> ...
> I would suggest:
>
> - add a "nr of sectors completed" argument to the "bi_end_io()" function,
> so that it looks like
>
> void xxx_bio_end_io(struct bio *bio, unsigned long completed)
> {
> /*
> * Old completion handlers that don't understand it
> * should just return immediately for partial bio
> * completion notifiers
> */
> if (bio->b_size)
> return;
> ...
> }
>
> which would allow things like mpage_end_io_read() to unlock pages as
> they complete, instead of unlocking them all in one go.

It's a feature! We don't want to have to soak up 20,000 context
switches a second just reading a file from an 80MB/sec disk.

> Comments? It looks trivial to do this from a bio level, and it would not
> be hard to update the existing end_io functions (because you can always
> just update them with the one-liner "if not totally done, return" to get
> the old behaviour).
>
> Andrew? I really dislike the lack of concurrency in our current mpage
> read-ahead thing. Whaddayathink?

Well it is supposed to lay out two BIOs, but if that happens, it's
fragile - it relies on BIO_MAX_SIZE=64k and default readahead=128k.

What I think we need to do here is to get Jens' bio_add_page() stuff
up and running and to then go through the device drivers and set their
max BIO size to something which is inversely proportional to the
device's expected bandwidth.

This way, the floppy readahead would lay out 1- or 2-page BIOs, and
the honking FC array would lay out 128k or larger BIOs.

(In fact, I would prefer that the device's nominal read- and write-
bandwidths be made available in some manner. This way the VM can
make these granularity-size decisions for readahead, and the VM
can then solve the machine-full-of-dirty-memory-against-a-slow-device
problem. But the non-blocking page reclaim code probably solves that
too).
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Chris Friesen: "Re: ARP and alias IPs"
Previous message: Oleg Drokin: "Re: [reiserfs-dev] Re: [PATCH] sparc32: wrong type of nlink_t"
Maybe in reply to: Linus Torvalds: "One more bio for for floppy users in 2.5.33.."
Next in thread: Andrew Morton: "Re: One more bio for for floppy users in 2.5.33.."