Re: What to expect with the 2.6 VM

Andrew Morton (akpm@digeo.com)
Mon, 30 Jun 2003 20:02:37 -0700


Andrea Arcangeli <andrea@suse.de> wrote:
>
> On Tue, Jul 01, 2003 at 02:39:47AM +0100, Mel Gorman wrote:
> > Reverse Page Table Mapping
> > ==========================
> ...
>
> you mention only the positive things, and never the fact that's the most
> hurting piece of kernel code in terms of performance and smp scalability
> until you actually have to swapout or pageout.

It has no SMP scalability cost, and not really much CPU cost (it's less
costly than HZ=1000, for example). Its main problem is space consumption.

> > Non-Linear Populating of Virtual Areas
> > ======================================
> ...
>
> and it was used to break truncate,

Spose so. Do we care though? Unix standards do not specify truncate
behaviour with nonlinear mappings anyway.

Our behaviour right now is "random crap". If there's a reason why we want
consistent semantics then yes, we'll need to do an rmap walk or something
in there. But is there a requirement? What is it?

One thing which clearly _should_ have sane semantics with nonlinear
mappings is mincore(). MAP_NONLINEAR broke that too.

> > the flags are implemented in many different parts of the kernel.
> > The
> > NOFAIL flag requires teh VM to constantly retry an allocation until it
>
> described this way it sounds like NOFAIL imply a deadlock condition.

NOFAIL is what 2.4 has always done, and has the deadlock opportunities
which you mention. The other modes allow the caller to say "don't try
forever".

It's mainly a cleanup - it allowed the removal of lots of "try forever"
loops by consolidating that behaviour in the page allocator. If all
callers are fixed up to not require NOFAIL then we don't need it any more.

> as for the per-zone lists, sure they increase scalability, but it loses
> aging information, the worst case will be reproducible on a 1.6G box,

Actually it improves aging information. Think of a GFP_KERNEL allocation
on an 8G 2.4.x box: an average of 10 or more highmem pages get bogusly
rotated to the wrong end of the LRU for each scanned lowmem page.

That's speculation btw. I don't have any numbers or tests which indicate
that it was a net win in this regard.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/