Re: 2.5.74-mm1 fails to boot due to APIC trouble, 2.5.73mm3 works.

Martin J. Bligh (mbligh@aracnet.com)
Fri, 04 Jul 2003 12:38:19 -0700


> On Fri, Jul 04, 2003 at 02:35:31AM -0700, William Lee Irwin III wrote:
>>>> Okay, now for the "final solution" wrt. sparse physical APIC ID's
>>>> in addition to what I hope is a fix for your bug. This uses a separate
>>>> bitmap type (of a NR_CPUS -independent width MAX_APICS) for physical
>>>> APIC ID bitmaps.
>>>> \begin{cross-fingers}
>
> On Fri, Jul 04, 2003 at 08:41:38AM -0700, Martin J. Bligh wrote:
>> Is it really necessary to turn half the apic code upside down in order
>> to fix this? What's the actual bugfix that's buried in this cleanup?
>> Despite the fact you seem to have gone out of your way to make this
>> hard to review, there are a few things I can see that strike me as odd.
>> Not necessarily wrong, but requiring more explanation.
>
> It's not a cleanup, and it doesn't touch trailing whitespace etc.

Maybe not, but it looks like one. Maybe if you actually explain
what you're trying to fix, and why?

I think this kind of change deserves a better explanation that
"I'm right" ... that's my main objection.

> On Fri, Jul 04, 2003 at 02:35:31AM -0700, William Lee Irwin III wrote:
>>> - if (i >= 0xf)
>>> + if (i >= APIC_BROADCAST_ID)
>
> On Fri, Jul 04, 2003 at 08:41:38AM -0700, Martin J. Bligh wrote:
>> Is that always correct? it's not equivalent.
>
> It is.

Explain. Not obvious to the casual observer.

> On Fri, Jul 04, 2003 at 02:35:31AM -0700, William Lee Irwin III wrote:
>>> diff -prauN mm1-2.5.74-1/include/asm-i386/mach-bigsmp/mach_apic.h physid-2.5.74-1/include/asm-i386/mach-bigsmp/mach_apic.h
>>> --- mm1-2.5.74-1/include/asm-i386/mach-bigsmp/mach_apic.h 2003-07-03 12:23:56.000000000 -0700
>>> +++ physid-2.5.74-1/include/asm-i386/mach-bigsmp/mach_apic.h 2003-07-04 02:47:45.000000000 -0700
>>> @@ -29,15 +29,15 @@ static inline cpumask_t target_cpus(void
>>> # define INT_DELIVERY_MODE dest_LowestPrio
>>> # define INT_DEST_MODE 1 /* logical delivery broadcast to all procs */
>>> -#define APIC_BROADCAST_ID (0x0f)
>>> +#define APIC_BROADCAST_ID (0xff)
>
> On Fri, Jul 04, 2003 at 08:41:38AM -0700, Martin J. Bligh wrote:
>> So ... you've tested that change on a bigsmp machine, right?
>> At least, provide some reasoning here. Like this comment further down the
>> patch ...
>
> APIC_BROADCAST_ID is an upper bound on valid physical APIC ID's as it
> is used in the code. That actually was commented in the patch.

I find it odd that this worked before then. Also seems to be a separate
issue from the rest of the patch. Is quite probably correct, is just
non-obvious in the context of the rest of the patch.

> On Fri, Jul 04, 2003 at 02:35:31AM -0700, William Lee Irwin III wrote:
>>> +/*
>>> + * this isn't really broadcast, just a (potentially inaccurate) upper
>>> + * bound for valid physical APIC id's
>>> + */
>
> On Fri, Jul 04, 2003 at 08:41:38AM -0700, Martin J. Bligh wrote:
>> Which makes the change just look wrong to me. If you're thinking
>> "physical clustered mode" that terminology just utterly confusing crap,
>> and the change is wrong, as far as I can see.
>
> The change is correct, and I am not thinking of any such thing.
> APIC_BROADCAST_ID's sole usage is for terminating loops over physical
> APIC ID's while setting the physical APIC ID's of IO-APIC's.

Why is Summit 0xF, and bigsmp 0xFF then?

> On Fri, Jul 04, 2003 at 02:35:31AM -0700, William Lee Irwin III wrote:
>>> +++ physid-2.5.74-1/include/asm-i386/mach-numaq/mach_apic.h
>>> 2003-07-04 02:45:17.000000000 -0700
>>>
>>> -static inline cpumask_t apicid_to_cpu_present(int logical_apicid)
>>> +static inline physid_mask_t apicid_to_cpu_present(int logical_apicid)
>>> {
>>> int node = apicid_to_node(logical_apicid);
>>> int cpu = __ffs(logical_apicid & 0xf);
>>>
>>> - return cpumask_of_cpu(cpu + 4*node);
>>> + return physid_mask_of_physid(cpu + 4*node);
>>> }
>
> On Fri, Jul 04, 2003 at 08:41:38AM -0700, Martin J. Bligh wrote:
>> Hmmmm. What are you using physical apicids here for? They seem
>> irrelevant to this function.
>
> Look at where it's used.

I did. Still unclear why you think this is correct, or what physical
apicids have to do with a function that maps from apicids to the
phys_cpu_present_map, which is a compact mapping of logical apicids
for NUMA-Q.

Sorry, but this needs more explanation.

M.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/