Re: out_of_memory() heuristic broken for different mem configurations

Marcelo Tosatti (marcelo@conectiva.com.br)
Tue, 6 Nov 2001 12:19:48 -0200 (BRST)


On Tue, 6 Nov 2001, Linus Torvalds wrote:

>
> On Tue, 6 Nov 2001, Marcelo Tosatti wrote:
> >
> > Looking at out_of_memory() I've found out that we will only kill a task if
> > we happen to call out_of_memory() ten times in one second.
>
> It should be "10 times in at _least_ one second, and at most 5 seconds
> apart", but yes, I can imagine that it doesn't work very well if you have
> tons of memory.
>
> The problem became more pronounced when we started freeing the swap cache:
> it seems that allows the machine to shuffle memory around so efficiently
> that I've seen it go for half a minute with vmstat claiming zero IO, and
> yet very few out-of-memory messages - I don't know why shrink_cache()
> _claims_ success under those circumstances, but I've seen it myself.
> Shrink_cache _should_ only return success when it has actually dropped a
> page that needs re-loading, but..
>
> > (end of swap)
> >
> > 23 1 1 4032960 2892 232 7648 16 24246 22 24252 235 997 0 72 28
> > 22 1 1 4032960 2872 232 7648 0 0 0 0 106 13 0 99 0
> > 23 0 2 4032960 2764 232 7648 0 0 0 0 116 11 0 100 0
> > 21 0 1 4032960 2856 232 7648 0 0 0 0 122 45 0 100 0
> > 21 0 1 4032960 2848 232 7648 0 0 0 0 117 21 0 100 0
> > 21 0 1 4032960 2588 232 7648 38 0 38 0 118 41 0 100 0
> > 21 0 1 4032960 2584 232 7648 0 0 0 0 123 10 0 100 0
>
> Note how you also go for seconds with no IO and no shrinking of the
> caches, while shrink_cache() is apparently happy (and no, it does not take
> several seconds to traverse even a 16GB inactive queue, there's something
> else going on)
>
> With the more aggressive max_mapped, the oom failure count could be
> dropped to something smaller, as a false positive from shrink_caches
> should be fairly rare. I don't think it needs to be tunable on memory
> size, I just didn't even try any other values on my machines (I noticed
> that the old values were too high once max_mapped was upped and the swap
> cache reclaiming was re-done, but I didn't try if five seconds and ten
> failures was any better than 10 seconds and five failures, for example)

Ok, I'll take a careful look at the shrink_cache()/try_to_free_pages() path
later and work out "saner" magic numbers for big and small memory workloads.
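
For reference, the heuristic being discussed can be modelled in user space
roughly as below. This is only a sketch built from the description quoted
above (ten failures, spread over at least one second, each at most five
seconds apart), not a copy of the kernel source. HZ = 100, struct oom_state
and oom_failure() are names assumed for the example, and the caller passes in
the tick count instead of reading jiffies.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ 100UL /* assumed ticks per second, for illustration only */

/* Hypothetical bookkeeping for one streak of reclaim failures. */
struct oom_state {
    unsigned long first; /* tick at which the current streak started */
    unsigned long last;  /* tick of the most recent failure */
    unsigned long count; /* failures counted in the current streak */
};

/*
 * Record one reclaim failure at tick `now`.
 * Returns true when the heuristic would pick a task to kill.
 */
static bool oom_failure(struct oom_state *s, unsigned long now)
{
    unsigned long since;

    assert(s);

    since = now - s->last;
    s->last = now;

    /* Previous failure was more than 5 seconds ago: start a new streak. */
    if (since > 5 * HZ)
        goto reset;

    /* The streak has to last at least one second before it counts. */
    if (now - s->first < HZ)
        return false;

    /* Fewer than ten counted failures: not considered OOM yet. */
    if (++s->count < 10)
        return false;

    /* Really out of memory: report a kill and start over. */
    s->first = now;
    s->count = 0;
    return true;

reset:
    s->first = now;
    s->count = 0;
    return false;
}
```

With failures arriving every 100ms the model reports a kill a little under
two seconds after the streak starts. With failures arriving more than five
seconds apart, every call lands in the reset branch and nothing is ever
killed, which matches the behaviour seen on the large-memory box.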
