Re: [RFC] My research agenda for 2.7

Mel Gorman (mel@csn.ul.ie)
Sun, 29 Jun 2003 20:25:30 +0100 (IST)


On Fri, 27 Jun 2003, Daniel Phillips wrote:

> > http://www.research.att.com/sw/tools/vmalloc/
> > (Vmalloc: A Memory Allocation Library)
>
> Whoops, I retract that. I did a quick scan of the page and ran into a link
> labled "Non-Commercial License Agreement and Software", attached to some form
> of license loaded with ugly patent words and so on.

The paper itself is still freely available and the idea could still be
implemented but I still think that the slab allocator will be sufficent
for order0 allocations and use caches to cluster the different types of
order0 allocations together. I've read that paper before and I think it is
overkill for the kernel because the allocation pattern isn't that
complicated in comparison to userspace memory management which that
allocator is aimed at. Not to mention that to get the most out of the
allocator, you need to know what your allocation pattern is going to be
like in advance.

Of course, at this point, I still need to write an RFC about using slab
for order0 allocations and ultimatly I'll have to put code where my emails
are :-/

To help show why slab might work though, I updated vmregress and released
0.10 available at http://www.csn.ul.ie/~mel/projects/vmregress/ . The main
addition is that it now can collect some basic statistics on the types of
page allocations. Specifically, with the trace_alloc module loaded and the
trace_pagealloc.diff patch applied (see kernel_patches/ directory),
/proc/vmregress/trace_alloc will print out the number of allocations for
each order that used either the GFP_KERNEL or GFP_USER flags.

To test it, I booted a clean 2.5.73 system[1], started X and ran
konqueror. The results were;

Kernel allocations
------------------
BeforeX After X
Order 0: 6886 14595
Order 1: 335 383
Order 2: 17 23
Order 3: 2 2
Order 4: 3 3
Order 5: 0 0
Order 6: 0 0
Order 7: 0 0
Order 8: 0 0
Order 9: 0 0
Order 10: 0 0

Userspace allocations
---------------------
Order 0: 53050 71505
Order 1: 1 1
Order 2: 0 0
Order 3: 0 0
Order 4: 0 0
Order 5: 0 0
Order 6: 0 0
Order 7: 0 0
Order 8: 0 0
Order 9: 0 0
Order 10: 0 0

As you can see, order0 allocations were a *lot* more common, at least in
my system. Because they are so common in comparison to other orders, I
think that putting order0 in slabs of size 2^MAX_ORDER will make
defragmentation *so* much easier, if not plain simple, because you can
shuffle around order0 pages in the slabs to free up one slab which frees
up one large 2^MAX_ORDER adjacent block of pages.

It's possible that a journalled filesystem (or some other subsystem I am
not familiar with) would show that order > 0 allocations are a lot more
important than my system shows but the means to collect the information is
in vmregress so be my guest. When the time requires, I'll extend vm
regress to see how fragmented memory actually gets.

Much much later, it would be worth looking at the buddy allocator and
seeing would it be worth changing to something much simpler as most of its
work would be handled by the slab allocator instead.

[1] Tonnes of errors were spewed out on uninitialised timers. With highmem
enabled, I got a sleeping while atomic error warning in a loop caused
by default_idle(). I still have to see can I get around to tracking
down if it's a PEBKAC problem or a kernel bug

-- 
Mel Gorman
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/