Re: Linux 2.4.5-ac15

Daniel Phillips (phillips@bonn-fries.net)
Fri, 22 Jun 2001 02:32:00 +0200


On Thursday 21 June 2001 21:50, Marcelo Tosatti wrote:
> On Thu, 21 Jun 2001, Daniel Phillips wrote:
> > On Thursday 21 June 2001 07:44, Marcelo Tosatti wrote:
> > > On Thu, 21 Jun 2001, Mike Galbraith wrote:
> > > > On Thu, 21 Jun 2001, Marcelo Tosatti wrote:
> > > > > Ok, I suspect that GFP_BUFFER allocations are fucking up here (they
> > > > > can't block on IO, so they loop insanely).
> > > >
> > > > Why doesn't the VM hang the syncing of queued IO on these guys via
> > > > wait_event or such instead of trying to just let the allocation fail?
> > >
> > > Actually the VM should limit the amount of data being queued for _all_
> > > kind of allocations.
> > >
> > > The problem is the lack of a mechanism which allows us to account the
> > > approximated amount of queued IO by the VM. (except for swap pages)
> >
> > Coincidence - that's what I started working on two days ago, and I'm
> > moving into the second generation design today. Look at
> > 'queued_sectors'. I found pretty quickly it's not enough, today I'm
> > adding 'submitted_sectors' to the soup. This will allow me to
> > distinguish between traffic generated by my own thread and other traffic.
>
> Could you expand on this, please ?

OK, I am doing opportunistic flushing, so I want to know that nobody else is
using the disk, and so long as that's true, I'll keep flushing out buffers.
Conversely, if anybody else queues a request I'll bail out of the flush loop
as soon as I've flushed the absolute minimum number of buffers, i.e., the
ones that were dirtied more than bdflush_params->age_buffer ago. But how do
I know if somebody else is submitting requests? The surest way is to keep a
submitted_sectors counter that counts every submission, and compare it to the
number of sectors I know I've submitted myself. (This counter wraps, so I
actually track the difference from its value on entering the flush loop.)

The first thing I found (duh) is that nobody else ever submits anything while
I'm in the flush loop, because on UP I (almost) never yield the CPU there.
On SMP I will get other threads submitting, but only rarely will a
submission land while I'm actually inside the flush loop. No good - I'm not
detecting the other disk activity reliably - back to the drawing board.

My original plan was to compute a running average of submission rates and use
that to control my opportunistic flushing. I departed from that because I
seemed to get good results with a much simpler strategy, the patch I already
posted. It's fundamentally flawed though - it works fine for constant light
load and constant full load, but not for sporadic loads. What I need is
something a lot smoother, more analog, so I'll return to my original plan.

What I want to notice is that the IO submission rate has fallen below a
certain level; then, once the IO backlog has also fallen below a few ms worth
of transfers, I can do the opportunistic flushing. In the flush loop I want
to submit enough buffers to be sure I'm using the full bandwidth, but not so
many that I create a big backlog that gets in the way of a surge in demand
from some other source. I'm still working out the details of that, so I will
not post an updated patch today after all ;-)

By the way, there's a really important throughput benefit for doing this
early flushing that I didn't put in the list when I first wrote about it.
It's this: whenever we have a bunch of buffers dirtied, if the disk bandwidth
is available we want to load up the disk right away, not 5 seconds from now.
If we wait 5 seconds, we just wasted 5 seconds of disk bandwidth. Again,
duh. So while my goal in doing this was initially to have it cost as little
throughput as possible, I see now that it's actually a win for throughput.
End of discussion about whether to put in the effort or not.

--
Daniel

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/