Re: [patch] 2.4.19-pre10-ac2: O(1) scheduler merge, -A3.

Ingo Molnar (mingo@elte.hu)
Mon, 17 Jun 2002 11:00:26 +0200 (CEST)


On Mon, 17 Jun 2002, Zwane Mwaikambo wrote:

> > > Can we add a config time option for irqbalance? I consider it extra
> > > overhead for setups which can do the interrupt distribution via hardware
> > > properly, [...]
> >
> > What x86 hardware do you have in mind?
>
> ye olde generic x86 SMP box, the interrupt handling imbalance came about
> with the P4 and their newer APIC setup did it not? Although i am aware
> that some x86 SMP boxes don't do the distribution properly too, thats
> why i reckon config option would be best.

even generic x86 SMP boxes benefit from irqbalance due to better irq
affinity. Actually one could hardly find a worse way to distribute
interrupts than the IO-APIC + lowest priority delivery mode does... [in
fact there is one, the P4's do it ;-) ]

> > > [...] also irqbalance breaks NUMAQ horribly seeing as it assumes a
> > > number of things like addressing modes.
> >
> > exactly what does it assume that breaks NUMAQ?
>
> <Disclaimer>
> I am not a NUMAQ expert and do not even have access to one for testing
> </Disclaimer>
>
> The addressing mode, irq_balance assumes that the addressing mode is
> logical mode (when programming the IOREDTBL entries), whilst NUMAQ uses
> a completely different addressing architecture. [...]

irqbalance uses the set_ioapic_affinity() method to set affinity. The
clustered APIC code is broken if it doesnt handle this properly. (i dont
have such hardware so i cant tell, but it indeed doesnt appear to handle
this case properly.) By wrapping around at node boundary the irqbalance
code will work just fine.

> [...] Also another thing is consider this situation;
>
> irqbalance programs IOAPIC#0 on node0 to deliver to CPU#6 on node1
>
> Will that interrupt get delivered?

i agree that this could be a problem, but set_ioapic_affinity() can be
made dependent on the actual NUMA setup that is used. This is absolutely
needed anyway for a proper /proc/irq/*/smp_affinity feature.

Ingo

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/