Re: [patch] adjustments to dirty memory thresholds

William Lee Irwin III (wli@holomorphy.com)
Wed, 28 Aug 2002 20:49:57 -0700


William Lee Irwin III wrote:
>> I've already written the patch to address it, though of course, I can
>> post those traces along with the patch once it's rediffed. (It's trivial
>> though -- just a fresh GFP flag and a check for it before calling
>> out_of_memory(), setting it in mempool_alloc(), and ignoring it in
>> slab.c.) It requires several rounds of "un-throttling" to reproduce
>> the OOM's, the nature of which I've outlined elsewhere.

On Wed, Aug 28, 2002 at 02:58:20PM -0700, Andrew Morton wrote:
> That's a sane approach. mempool_alloc() is designed for allocations
> which "must" succeed if you wait long enough.
> In fact it might make sense to only perform a single scan of the
> LRU if __GFP_WLI is set, rather than the increasing priority thing.
> But sigh. Pointlessly scanning zillions of dirty pages and doing nothing
> with them is dumb. So much better to go for a FIFO snooze on a per-zone
> waitqueue, be woken when some memory has been cleansed. (That's effectively
> what mempool does, but it's all private and different).

Here's a stab in that direction, against 2.5.31. A trivially different
patch was tested and verified to solve the problems in practice. A
theoretical deadlock remains where a mempool allocator sleeps on general
purpose memory and is not woken when the mempool is replenished.

Cheers,
Bill

diff -urN linux-2.5.31-virgin/include/linux/gfp.h linux-2.5.31-nokill/include/linux/gfp.h
--- linux-2.5.31-virgin/include/linux/gfp.h 2002-08-10 18:41:24.000000000 -0700
+++ linux-2.5.31-nokill/include/linux/gfp.h 2002-08-28 02:22:55.000000000 -0700
@@ -17,6 +17,7 @@
#define __GFP_IO 0x40 /* Can start low memory physical IO? */
#define __GFP_HIGHIO 0x80 /* Can start high mem physical IO? */
#define __GFP_FS 0x100 /* Can call down to low-level FS? */
+#define __GFP_NOKILL 0x200 /* Should not OOM kill */

#define GFP_NOHIGHIO ( __GFP_WAIT | __GFP_IO)
#define GFP_NOIO ( __GFP_WAIT)
diff -urN linux-2.5.31-virgin/include/linux/slab.h linux-2.5.31-nokill/include/linux/slab.h
--- linux-2.5.31-virgin/include/linux/slab.h 2002-08-10 18:41:28.000000000 -0700
+++ linux-2.5.31-nokill/include/linux/slab.h 2002-08-28 02:22:55.000000000 -0700
@@ -24,7 +24,7 @@
#define SLAB_NFS GFP_NFS
#define SLAB_DMA GFP_DMA

-#define SLAB_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_HIGHIO|__GFP_FS)
+#define SLAB_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_HIGHIO|__GFP_FS|__GFP_NOKILL)
#define SLAB_NO_GROW 0x00001000UL /* don't grow a cache */

/* flags to pass to kmem_cache_create().
diff -urN linux-2.5.31-virgin/mm/mempool.c linux-2.5.31-nokill/mm/mempool.c
--- linux-2.5.31-virgin/mm/mempool.c 2002-08-10 18:41:19.000000000 -0700
+++ linux-2.5.31-nokill/mm/mempool.c 2002-08-28 02:22:55.000000000 -0700
@@ -186,7 +186,11 @@
unsigned long flags;
int curr_nr;
DECLARE_WAITQUEUE(wait, current);
- int gfp_nowait = gfp_mask & ~(__GFP_WAIT | __GFP_IO);
+ int gfp_nowait;
+
+ gfp_mask |= __GFP_NOKILL;
+
+ gfp_nowait = gfp_mask & ~(__GFP_WAIT | __GFP_IO | __GFP_NOKILL);

repeat_alloc:
element = pool->alloc(gfp_nowait, pool->pool_data);
diff -urN linux-2.5.31-virgin/mm/vmscan.c linux-2.5.31-nokill/mm/vmscan.c
--- linux-2.5.31-virgin/mm/vmscan.c 2002-08-10 18:41:21.000000000 -0700
+++ linux-2.5.31-nokill/mm/vmscan.c 2002-08-28 03:17:15.000000000 -0700
@@ -401,23 +401,24 @@

int try_to_free_pages(zone_t *classzone, unsigned int gfp_mask, unsigned int order)
{
- int priority = DEF_PRIORITY;
- int nr_pages = SWAP_CLUSTER_MAX;
+ int priority, status, nr_pages = SWAP_CLUSTER_MAX;

KERNEL_STAT_INC(pageoutrun);

- do {
+ for (priority = DEF_PRIORITY; priority; --priority) {
nr_pages = shrink_caches(classzone, priority, gfp_mask, nr_pages);
- if (nr_pages <= 0)
- return 1;
- } while (--priority);
+ status = (nr_pages <= 0) ? 1 : 0;
+ if (status || (gfp_mask & __GFP_NOKILL))
+ goto out;
+ }

/*
* Hmm.. Cache shrink failed - time to kill something?
* Mhwahahhaha! This is the part I really like. Giggle.
*/
out_of_memory();
- return 0;
+out:
+ return status;
}

DECLARE_WAIT_QUEUE_HEAD(kswapd_wait);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/