Re: the oom killer

Marcelo Tosatti (marcelo@conectiva.com.br)
Fri, 5 Apr 2002 18:45:08 -0300 (BRT)


On Fri, 5 Apr 2002, Andrea Arcangeli wrote:

> On Fri, Apr 05, 2002 at 01:18:26AM -0800, Andrew Morton wrote:
> >
> > Andrea,
> >
> > Marcelo would prefer that the VM retain the oom killer. The thinking
> > is that if try_to_free_pages fails, then we're better off making a
> > deliberate selection of the process to kill rather than the random(ish)
> > selection which we make by failing the allocation.
> >
> > One example is at
> >
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=101405688319160&w=2
> >
> > That failure was with vm-24, which I think had the less aggressive
>
> vm-24 had a problem yes, that is fixed in the latest releases.
>
> > i/dcache shrink code. We do need to robustly handle the no-swap-left
> > situation.
> >
> > So I have resurrected the oom killer. The patch is below.
> >
> > During testing of this, a problem cropped up. The machine has 64 megs
> > of memory, no swap. The workload consisted of running `make -j0
> > bzImage' in parallel with `usemem 40'. usemem will malloc a 40
> > megabyte chunk, memset it and exit.
> >
> > The kernel livelocked. What appeared to be happening was that ZONE_DMA
> > was short on free pages, but ZONE_NORMAL was not. So this check:
> >
> > if (!check_classzone_need_balance(classzone))
> > break;
> >
> > in try_to_free_pages() was seeing that ZONE_NORMAL had some headroom
> > and was causing a return to __alloc_pages().
> >
> > __alloc_pages has this logic:
> >
> > min = 1UL << order;
> > for (;;) {
> > zone_t *z = *(zone++);
> > if (!z)
> > break;
> >
> > min += z->pages_min;
> > if (z->free_pages > min) {
> > page = rmqueue(z, order);
> > if (page)
> > return page;
> > }
> > }
> >
> >
> > On the first pass through this loop, `min' gets the value
> > zone_dma.pages_min + 1. On the second pass through the loop it gets
> > the value zone_dma.pages_min + 1 + zone_normal.pages_min. And this is
> > greater than zone_normal.free_pages! So alloc_pages() gets stuck in an
> > infinite loop.
>
> This is a bug I fixed in the -rest patch, that's also broken on numa.
> The deadlock cannot happen if you apply all my patches.

How did you fixed this specific problem?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/