Re: [PATCH] 2.2.17pre7 VM enhancement Re: I/O performance on

Andrea Arcangeli (andrea@suse.de)
Wed, 28 Jun 2000 18:45:51 +0200 (CEST)


On Mon, 26 Jun 2000, Rik van Riel wrote:

>Actually that's not the case. You're forgetting that
>kswapd _frees_ memory...

So you want to do a very different thing: you want to wake up kflushd each
time you free a page in 2.4.x, or each time you succeed in freeing a buffer
in 2.2.x. So you want to do the balance_dirty check in the success path
(not in the fail path!!).

I think that would be a little overkill: we already wake up kswapd as soon
as we find a busy buffer, and that should be enough in practice (even more
so in 2.4.x, where we only walk the cache thanks to the lru).

BTW, the thing below that was added in 17pre6:

	/*
	 * Wait for async IO to complete
	 * at each 64 buffers
	 */
	int wait = ((gfp_mask & __GFP_IO)
		    && (!(nr_dirty++ % 64)));

doesn't make sense to me.

First, the comment is wrong, since the __GFP_IO check sometimes
short-circuits the nr_dirty++ (I guess that was not intended). Second, the
magic value "64" is just an arbitrary number.

Then suppose there's 1 locked buffer in all the VM, and suppose there are
lots of clean and freeable buffers (more than 64 buffers at least). Why
the heck should we wait for I/O completion on such an async buffer, generated
previously by `cp` (where not even `cp` is waiting synchronously for it,
because it was a write, and maybe it's also getting written to a slow
floppy), while I do a little malloc?
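
The scenario above can be sketched as a toy model. Everything here is an
assumption made for illustration: the demo_* names, the flat array of
buffer states, and the idea that __GFP_IO is always set. It only counts how
often a scan would block; it is neither the 2.2.x code nor a proposed fix.

```c
#include <stddef.h>

enum demo_state { DEMO_CLEAN, DEMO_LOCKED };

struct demo_result {
	size_t freed;        /* clean buffers reclaimed */
	unsigned int waits;  /* times the scan would have blocked on I/O */
};

/*
 * Walk the buffers until `want` clean ones have been freed.
 * With wait_every_64 set, a locked buffer triggers a wait on the
 * 1st, 65th, ... locked buffer seen, as in the quoted expression.
 * With it clear, locked buffers are simply skipped.
 */
static struct demo_result demo_scan(const enum demo_state *bufs, size_t n,
				    size_t want, int wait_every_64)
{
	struct demo_result res = { 0, 0 };
	unsigned int nr_dirty = 0;
	size_t i;

	for (i = 0; i < n && res.freed < want; i++) {
		if (bufs[i] == DEMO_CLEAN) {
			res.freed++;
			continue;
		}
		if (wait_every_64 && !(nr_dirty++ % 64))
			res.waits++;
	}
	return res;
}
```

With one locked buffer at the head of the array followed by plenty of clean
ones, the every-64 rule blocks once before freeing anything (the counter
starts at 0, and 0 % 64 == 0), while skipping the locked buffer frees the
same memory without blocking at all.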

I see that at some point we may have to wait for I/O completion, if all the
VM happens to be somehow dirty, and I have ideas on how to do it with care
(and no, not the way 2.4.x does it either; btw, 2.4.x has the buggy
short-circuit side effect too).

But before I even go to implement the above, we first have to account the
dirty pages in the MAP_SHARED segments as __dirty__. That is __strictly__
necessary for allowing the machine to allocate without blocking while
there's heavy I/O pressure and little non-write-I/O related memory
allocation pressure. And fixing that is probably also going to solve all
the oom problems reported (even if I see that, to be completely correct, we
should also be able to wait for dirty memory objects to become clean again,
but that's another issue).

We must solve that _first_ and _right_ for the MAP_SHARED segment too (and
the way to solve that is _completely_ different from whatever I have seen
floating around so far). Incidentally, solving that isn't that trivial,
since it involves the page fault path (->no_page and do_wp_page at least)
plus changes in the way we collect the map_shared dirty pages, and I
believe it should be fixed in 2.4.x only at first.

For 2.2.x I still suggest all the patches I listed in my earlier email to
the list. A too-large MAP_SHARED mapping (aka mmap002) will still not be
reliable (you will need to buy more RAM), but there won't be side effects
on the cases that already work right.

Andrea
