Re: [RFC] Early flush (was: spindown)

Daniel Phillips (phillips@bonn-fries.net)
Sat, 23 Jun 2001 07:10:45 +0200


On Saturday 23 June 2001 01:25, Daniel Kobras wrote:
> On Wed, Jun 20, 2001 at 10:12:38AM -0600, Richard Gooch wrote:
> > Daniel Phillips writes:
> > > I'd like that too, but what about sync writes? As things stand now,
> > > there is no option but to spin the disk back up. To get around this
> > > we'd have to change the basic behavior of the block device and
> > > that's doable, but it's an entirely different proposition than the
> > > little patch above.
> >
> > I don't care as much about sync writes. They don't seem to happen very
> > often on my boxes.
>
> syslog and some editors are the most common users of sync writes. vim,
> e.g., per default keeps fsync()ing its swapfile. Tweaking the configuration
> of these apps, this can be prevented fairly easy though. Changing sync
> semantics for this matter on the other hand seems pretty awkward to me. I'd
> expect an application calling fsync() to have good reason for having its
> data flushed to disk _now_, no matter what state the disk happens to be in.
> If it hasn't, fix the app, not the kernel.

But apps shouldn't have to know about the special requirements of laptops.
I've been playing a little with the idea of creating a special block device
for laptops that goes between the vfs and the real block device, and adds the
behaviour of being able to buffer writes in memory. In all respects it would
seem to the vfs to be a disk. So far this is just a thought experiment.

> > > You know about this project no doubt:
> > >
> > > http://noflushd.sourceforge.net/
> >
> > Only vaguely. It's huge. Over 2300 lines of C code and >560 lines in
> > .h files! As you say, not really lightweight. There must be a better
> > way.
>
> noflushd would benefit a lot from being able to set bdflush parameters per
> device or per disk. So I'm really eager to see what Daniel comes up with.
> Currently, we can only turn kupdate either on or off as a whole, which
> means that noflushd implements a crude replacement for the benefit of
> multi-disk setups. A lot of the cruft stems from there.

Yes, another person to talk to about this is Jens Axboe who has been doing
some serious hacking on the block layer. I thought I'd get the early flush
patch working well for one disk before generalizing to N ;-)

> > Also, I suspect (without having looked at the code) that it
> > doesn't handle memory pressure well. Things may get nasty when we run
> > low on free pages.
>
> It doesn't handle memory pressure at all. It doesn't have to. noflushd only
> messes with kupdate{,d} but leaves bdflush (formerly known as kflushd)
> alone. If memory gets tight, bdflush starts writing out dirty buffers,
> which makes the disk spin up, and we're back to normal.

Exactly. And in addition, when bdflush does wake up, I try to get kupdate
out of the way as much as possible, though I've been following the
traditional recipe and having it submit all buffers past a certain age. This
is quite possibily a bad thing to do because it could starve the swapper.
Ouch.

--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/