Re: 2.5.74-mm1

William Lee Irwin III (wli@holomorphy.com)
Sat, 5 Jul 2003 15:03:33 -0700


William Lee Irwin III <wli@holomorphy.com> wrote:
>> Sorry, that was one hell of an oversight wrt. the initival value.

On Sat, Jul 05, 2003 at 02:27:52PM -0700, Andrew Morton wrote:
> You made me read that code about 20 times ;)
> Still. Do we think we know what the actual bug is? That tasklist_lock
> doesn't pin tsk->mm?
> If so then let's get that patch of yours happening, but please enhance it to
> a) detect the situation where the mm went away, and tell us that it was
> fixed up. Sufficient to confirm your theory.
> b) put in an explicit check for a kill of an mm-less process. print a
> warning, skip the process.

exit_mm() is excluded by the task_lock(), so p->mm (or any of the
others) can vanish at any time (the callers don't hold the tasklist_lock
either). So did you have in mind something like this?

-- wli

===== mm/oom_kill.c 1.23 vs edited =====
--- 1.23/mm/oom_kill.c Wed Apr 23 03:15:53 2003
+++ edited/mm/oom_kill.c Sat Jul 5 15:00:15 2003
@@ -141,8 +141,16 @@
* CAP_SYS_RAW_IO set, send SIGTERM instead (but it's unlikely that
* we select a process with CAP_SYS_RAW_IO set).
*/
-void oom_kill_task(struct task_struct *p)
+static void __oom_kill_task(task_t *p)
{
+ task_lock(p);
+ if (!p->mm || p->mm == &init_mm) {
+ WARN_ON(1);
+ printk(KERN_WARNING "tried to kill an mm-less task!\n");
+ task_unlock(p);
+ return;
+ }
+ task_unlock(p);
printk(KERN_ERR "Out of Memory: Killed process %d (%s).\n", p->pid, p->comm);

/*
@@ -161,6 +169,16 @@
}
}

+static struct mm_struct *oom_kill_task(task_t *p)
+{
+ struct mm_struct *mm = get_task_mm(p);
+ if (!mm || mm == &init_mm)
+ return NULL;
+ __oom_kill_task(p);
+ return mm;
+}
+
+
/**
* oom_kill - kill the "best" process when we run out of memory
*
@@ -171,9 +189,11 @@
*/
static void oom_kill(void)
{
+ struct mm_struct *mm;
struct task_struct *g, *p, *q;

read_lock(&tasklist_lock);
+retry:
p = select_bad_process();

/* Found nothing?!?! Either we hang forever, or we panic. */
@@ -182,17 +202,21 @@
panic("Out of memory and no killable processes...\n");
}

- oom_kill_task(p);
+ mm = oom_kill_task(p);
+ if (!mm)
+ goto retry;
/*
* kill all processes that share the ->mm (i.e. all threads),
* but are in a different thread group
*/
do_each_thread(g, q)
- if (q->mm == p->mm && q->tgid != p->tgid)
- oom_kill_task(q);
+ if (q->mm == mm && q->tgid != p->tgid)
+ __oom_kill_task(q);
while_each_thread(g, q);
-
+ if (!p->mm)
+ printk(KERN_INFO "Fixed up OOM kill of mm-less task\n");
read_unlock(&tasklist_lock);
+ mmput(mm);

/*
* Make kswapd go out of the way, so "p" has a good chance of
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/