Re: Strange panic as soon as timer interrupts are enabled (recent 2.5)

Andrew Morton (akpm@digeo.com)
Wed, 06 Nov 2002 11:46:35 -0800


"J.E.J. Bottomley" wrote:
>
> Martin.Bligh@us.ibm.com said:
> > Conversations on IRC revealed that jejb has hit the same thing on
> > voyager ... as I understood him, he felt the cause was the CPU was
> > taking an interrupt before cpu_up was called, and the interrupt was
> > going back through a non existent tasklet structure (tasklets now
> > have per_cpu areas which are allocated as the cpu comes up). I'll let
> > him discuss what he did to fix that, but the ensuing discussion made
> > me think that taking this out to a wider audience for an appropriate
> > long-term solution would be prudent.
>
> Yes, this caused a completely reliable boot-time panic for me with 2.5.46.
> The problem is that per_cpu areas aren't initialised until cpu_up is called,
> so a cpu cannot now take an interrupt before cpu_up is called.

Rusty's da man on this, but I think the fix is to not turn on
the interrupts (at the APIC level) until cpu_up() has called
__cpu_up(). Look at cpu_up():

	ret = notifier_call_chain(&cpu_chain, CPU_UP_PREPARE, hcpu);
	if (ret == NOTIFY_BAD) {
		printk("%s: attempt to bring up CPU %u failed\n",
			__FUNCTION__, cpu);
		ret = -EINVAL;
		goto out_notify;
	}

	/* Arch-specific enabling code. */
	ret = __cpu_up(cpu);

The softirq storage is initialised inside the CPU_UP_PREPARE call.
So we're ready for interrupts on that CPU when your architecture's
__cpu_up() is called. And no sooner than this.
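
To make the ordering concrete, here is a minimal user-space sketch in C. It is
not kernel code: every sketch_* name is hypothetical, the per-CPU area is a
plain heap allocation, and "interrupts enabled" is just a flag. It only models
the rule above: storage is set up in the prepare step, and the arch step is
the first point at which an interrupt may be delivered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical stand-in for the per-CPU softirq/tasklet storage. */
struct sketch_softirq_area {
	unsigned long pending;
};

static struct sketch_softirq_area *sketch_area[SKETCH_NR_CPUS];
static bool sketch_irqs_on[SKETCH_NR_CPUS];

/* Models the CPU_UP_PREPARE step: allocate the per-CPU storage. */
static int sketch_prepare(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	if (!sketch_area[cpu])
		sketch_area[cpu] = calloc(1, sizeof(*sketch_area[cpu]));
	return sketch_area[cpu] ? 0 : -1;
}

/* Models the arch step: only from here on may interrupts be delivered. */
static int sketch_arch_cpu_up(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	sketch_irqs_on[cpu] = true;
	return 0;
}

/*
 * Models an interrupt arriving on a CPU.
 * Returns 1 if it is not delivered (interrupts still off),
 *         0 if it is handled using the per-CPU storage,
 *        -1 if it is delivered with no storage (the panic case).
 */
static int sketch_take_interrupt(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	if (!sketch_irqs_on[cpu])
		return 1;
	if (!sketch_area[cpu])
		return -1;
	sketch_area[cpu]->pending |= 1UL;
	return 0;
}

/* Same ordering as the cpu_up() excerpt: prepare first, then the arch step. */
static int sketch_cpu_up(unsigned int cpu)
{
	int ret = sketch_prepare(cpu);

	if (ret)
		return ret;
	return sketch_arch_cpu_up(cpu);
}
```

In this model, sketch_take_interrupt() can only return -1 when the interrupt
flag is set by something other than sketch_cpu_up(), which is the analogue of
an architecture unmasking interrupts too early.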