Re: Status of the IO scheduler fixes for 2.4

Andrea Arcangeli (andrea@suse.de)
Sat, 5 Jul 2003 03:04:48 +0200


On Fri, Jul 04, 2003 at 08:47:00PM -0400, Chris Mason wrote:
> On Fri, 2003-07-04 at 20:05, Andrea Arcangeli wrote:
> > On Fri, Jul 04, 2003 at 05:37:35PM -0400, Chris Mason wrote:
> > > I've also attached a patch I've been working on to solve the latencies a
> > > different way. bdflush-progress.diff changes balance_dirty to wait on
> > > bdflush instead of trying to start io. It helps a lot here (both
> > > throughput and latency) but hasn't yet been tested on a box with tons of
> > > drives.
> >
> > that's orthogonal, it changes the write throttling, it doesn't touch the
> > blkdev layer like the other patches do. So if it helps it solves a
> > different kind of latency.
>
> It's also almost useless without elevator-low-latency applied ;-) One
> major source of our latencies is a bunch of procs hammering on
> __get_request_wait, so bdflush-progress helps by reducing the number of
> procs doing io. It does push some of the latency problem up higher into
> balance_dirty, but I believe it is easier to manage there.
>
> bdflush-progress doesn't help at all for non-async workloads.

agreed.

> > However the implementation in theory can run the box oom, since it won't
> > limit the dirty buffers anymore. To be safe you need to wait 2
> > generations. I doubt in practice it would be easily reproducible though ;).
>
> Hmmm, I think the critical part here is to write some buffers after
> marking a buffer dirty. We don't check the generation number until

btw, not after a buffer, but after as many as "buffers per page" buffers
(in fact, for the code to be correct on systems with quite big page sizes,
the 32 should be replaced with a max(bh_per_page, 32))
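
A minimal sketch of that max(bh_per_page, 32), in plain userspace C. The
function name, the WRITE_BATCH_MIN constant and the explicit page_size /
block_size parameters are made up for illustration; only the 32 and
bh_per_page come from the discussion.

```c
/* Hypothetical name for the hardcoded 32 discussed above. */
#define WRITE_BATCH_MIN 32UL

/*
 * Number of buffers to write before checking for progress.
 * bh_per_page is the number of buffer heads that fit in one page;
 * with a big page size it can exceed 32, so take the larger value.
 */
static unsigned long write_batch(unsigned long page_size,
                                 unsigned long block_size)
{
    unsigned long bh_per_page = page_size / block_size;

    return bh_per_page > WRITE_BATCH_MIN ? bh_per_page : WRITE_BATCH_MIN;
}
```

With 4k pages and 1k blocks bh_per_page is 4, so the batch stays at 32; with
64k pages and 512 byte blocks it is 128, and the batch has to grow with it.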

> after marking the buffer dirty, so if the generation has incremented at
> all we've made progress. What am I missing?

write 32 buffers
read generation
inc generation
generation changed -> OOM

Am I missing any smp serialization between the two cpus?
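
To make the interleaving above concrete, here is a minimal userspace sketch
in C. It is not kernel code: the struct, the helper names and the
snapshot-style flush model are all assumptions made for illustration, and it
encodes one reading of the race, where the flush pass that bumps the
generation had already started before the 32 buffers were dirtied.

```c
/* Hypothetical model of the writer/bdflush interaction, not kernel code. */
struct model {
    unsigned long generation; /* bumped each time a flush pass completes */
    unsigned long dirty;      /* dirty buffers currently outstanding */
    unsigned long in_flight;  /* buffers picked up by the pass in progress */
};

/* A flush pass only sees the buffers that were dirty when it started. */
static void flush_begin(struct model *m)
{
    m->in_flight = m->dirty;
}

/* Completing a pass cleans what it picked up and bumps the generation. */
static void flush_end(struct model *m)
{
    m->dirty -= m->in_flight;
    m->in_flight = 0;
    m->generation++;
}

static void writer_dirty(struct model *m, unsigned long n)
{
    m->dirty += n;
}

/*
 * Worst-case interleaving: the writer dirties 32 buffers while a pass is
 * already running, then waits for `gens` generation changes. Returns the
 * number of buffers still dirty when the writer is released.
 */
static unsigned long dirty_left_after_wait(unsigned long gens)
{
    struct model m = { 0, 0, 0 };
    unsigned long seen;

    flush_begin(&m);       /* pass starts, nothing is dirty yet */
    writer_dirty(&m, 32);  /* write 32 buffers */
    seen = m.generation;   /* read generation */
    flush_end(&m);         /* inc generation, for the older pass */

    while (m.generation - seen < gens) {
        flush_begin(&m);   /* a pass that started after the dirtying */
        flush_end(&m);
    }
    return m.dirty;
}
```

In this model dirty_left_after_wait(1) leaves all 32 buffers dirty even
though the generation changed, while dirty_left_after_wait(2) returns 0,
which is the reason for waiting 2 generations.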

Andrea