Re: [patch 3/4] slab reclaim balancing

Manfred Spraul (manfred@colorfullife.com)
Fri, 27 Sep 2002 19:24:56 +0200


Ed Tomlinson wrote:
>
> There is no dispute that in some cases it will be slower from a slab perspective. As
> Andrew and you have discussed there are things that can be done to speed things
> up. Is not the question really, "Are the vm and slab faster together when slab pages
> are freed asap?"
>

Some caches are quite bursty - what about the 2 kB generic cache that is
used for the MTU-sized socket buffers? With interrupt mitigation
enabled, I'd expect that a GigE NIC could allocate a few dozen 2 kB
objects in every interrupt, and I don't think it's the right approach to
effectively disable the cache in slab.c for such loads.

I do not have many data points, but in a netbench run on a 4-way Xeon,
kmem_cache_free is called 5 million times per minute, with an additional
4 million calls to kfree. I agree that _reap is bad right now, but IMHO
it's questionable whether the fix belongs in the hot path of the
allocator.

What about this approach:

* Enable batching even on UP, with a LIFO array in front of the lists.

* After flushing a batch back into the lists, the number of free objects
in the lists is calculated. If freeable pages exist and the count
exceeds a target, the freeable pages above the target are returned
to the buddy allocator.
* The target of freeable pages is increased by kmem_cache_grow - if we
had to get another page from gfp, then our own cache was too small.

Since the test for the number of freeable objects happens only after
batching, i.e. in the worst case once for every 30 kmem_cache_free
calls, it doesn't matter if the test is a bit expensive.
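
To make the shape of that free path concrete, here is a minimal C
sketch. All names below (struct array_cache, cache_free, the list_*
helpers, the constants) are simplified stand-ins for illustration,
not the actual slab.c interfaces:

#define ARRAY_LIMIT	60	/* per-cpu LIFO array capacity */
#define BATCH_COUNT	30	/* objects flushed per batch */

struct array_cache {
	unsigned int avail;		/* objects currently in the array */
	void *entry[ARRAY_LIMIT];	/* LIFO stack of free objects */
};

struct cache {
	struct array_cache ac;		/* one per CPU in the real thing */
	unsigned int free_objects;	/* free objects on the slab lists */
	unsigned int free_target;	/* freeable objects to keep cached */
	unsigned int objs_per_page;	/* objects carved from one page */
};

/* hypothetical helpers standing in for the real slab list handling */
extern void list_free_obj(struct cache *c, void *obj);
extern int list_release_page(struct cache *c);	/* 1 if a page was freed */

static void flush_batch(struct cache *c)
{
	unsigned int i;

	for (i = 0; i < BATCH_COUNT; i++)
		list_free_obj(c, c->ac.entry[--c->ac.avail]);
	c->free_objects += BATCH_COUNT;
}

void cache_free(struct cache *c, void *obj)
{
	struct array_cache *ac = &c->ac;

	if (ac->avail < ARRAY_LIMIT) {
		/* hot path: push onto the LIFO array, no list walk */
		ac->entry[ac->avail++] = obj;
		return;
	}

	/* cold path, at worst once per BATCH_COUNT frees */
	flush_batch(c);
	ac->entry[ac->avail++] = obj;

	/* return whole freeable pages above the target to the buddy */
	while (c->free_objects >= c->free_target + c->objs_per_page) {
		if (!list_release_page(c))
			break;	/* free objects, but no fully free page */
		c->free_objects -= c->objs_per_page;
	}
}

/*
 * kmem_cache_grow analogue: if we had to get another page from gfp,
 * then our reserve of freeable pages was too small - raise the target.
 */
void cache_grew(struct cache *c)
{
	c->free_target += c->objs_per_page;
}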

Open problems:

* What about caches with large objects (> PAGE_SIZE, e.g. the bio
MAX_PAGES object, or the 16 kB socket buffers used over loopback)? Right
now they are not cached in the per-cpu arrays, to reduce memory
pressure. If the list processing becomes slower, we would slow down
these slab users. But OTOH, if you memcpy 16 kB, then a few cycles in
kmalloc probably won't matter much.

* Where to flush the per-cpu caches? On a 16-way system they can
contain up to 4000 objects per cache. Right now that happens in
kmem_cache_reap(). One flush per second would be enough, just to ensure
that on lightly loaded slabs objects don't remain forever in the per-cpu
arrays and prevent pages from becoming freeable (a sketch follows after
this list).

* Where is the freeable pages limit decreased?
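
For the periodic flush, something along these lines might do - again
the names are made up and the timer plumbing is omitted. The idea is a
once-per-second drain of each per-cpu array back into the lists, so
that the trimming logic in the sketch above gets a chance to return
the pages:

/*
 * Drain one per-cpu array completely back into the slab lists.
 * Run roughly once per second per CPU (e.g. from a timer), so
 * that objects on lightly loaded slabs can't hide in the array
 * and pin otherwise freeable pages.
 */
static void drain_cpu_array(struct cache *c)
{
	struct array_cache *ac = &c->ac;

	while (ac->avail) {
		list_free_obj(c, ac->entry[--ac->avail]);
		c->free_objects++;
	}
}

On SMP the drain would of course have to run on, or synchronize with,
the CPU that owns the array.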

--
	Manfred
