For the ISDN one:
E001B - the EURO ISDN cause "Out of order" means that there is no answer
from the exchange while trying to establish a D-channel L2 connection.
This can have various reasons: a broken cable, wrong addresses, or no
IRQs. The missing IRQs may (but need not) be related to the APIC errors.
I have the same board here with 2*233 MMX and don't see this kind of ISDN
error on recent 2.2 kernels, but I also get a lot of APIC errors with
2.3/2.4, because the APIC errors are only reported in 2.3/2.4.
> They system is still usable after such an error, only that eth0/isdn is
> not accessible, even if I reload the modules. The only solution
> is a reboot.
>
> Well - some days ago I tried to switch to 2.4.3, hoping that these
> errors will be gone then. The first thing that I noticed was that I got
> thousands of lines like this:
>
> Apr 22 16:19:31 violin kernel: APIC error on CPU0: 04(00)
No, the kernel cannot change this, since it is a hardware problem.
The GA586DX is known to produce a lot of checksum errors on the APIC
bus. In 2.4 these are reported; in 2.2 they are simply ignored, but they
occur there as well. These errors themselves are not a problem, since the
APIC bus detects them and recovers, but if there are double errors such
that the checksum is still OK, the APIC may run into trouble.
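
A rough sketch of how the two hex values in an "APIC error on CPUn: xx(yy)"
line could be decoded. It assumes they are local APIC error status register
values and uses the bit meanings listed in the comment next to the APIC error
handler in the 2.4 kernel source; the names ESR_BITS, decode_esr and
parse_apic_error are made up for this illustration.

```python
import re

# Bit meanings of the local APIC error status register (assumed to match
# the comment in the 2.4 kernel's APIC error handler). Index = bit number.
ESR_BITS = [
    "Send CS error",
    "Receive CS error",
    "Send accept error",
    "Receive accept error",
    "Reserved",
    "Send illegal vector",
    "Received illegal vector",
    "Illegal register address",
]

# Matches e.g. "APIC error on CPU0: 04(00)" anywhere in a syslog line.
LINE_RE = re.compile(r"APIC error on CPU(\d+): ([0-9a-fA-F]+)\(([0-9a-fA-F]+)\)")


def decode_esr(value):
    """Return the names of all bits set in an 8-bit error status value."""
    return [name for bit, name in enumerate(ESR_BITS) if value & (1 << bit)]


def parse_apic_error(line):
    """Return (cpu, first_bits, second_bits), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    cpu = int(m.group(1))
    first = int(m.group(2), 16)
    second = int(m.group(3), 16)
    return cpu, decode_esr(first), decode_esr(second)
```

For the line quoted above, the first value 04 has only bit 2 set and the
second value 00 has no bits set.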
> Errors!) the isdn subsystem died:
> Apr 18 16:32:12 violin kernel: isdn_tx_timeout dev ippp0 dialstate 0
> Apr 18 16:32:12 violin kernel: ippp0: all channels busy - requeuing!
Yes, that is also a hint that the IRQ of the card is blocked.
> Following the advice of Donald Becker he gave in some newsgroup I
> restarted the
> kernel with the "noapic" parameter. The strange thing is that the APIC
> errors are still there, at least there are a lot less than before,
> moreover the system seems slower but at least more stable. BTW, why are
> there still APIC errors although there are no interrupts assigned to
> CPU1 (as seen in /proc/interrupts).
>
Yes, "noapic" means all IRQs are handled by one CPU only, so communication
errors about IRQ events on the APIC bus don't matter.
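
To double-check that all IRQs really end up on one CPU, the per-CPU columns
of /proc/interrupts can be summed. A minimal sketch, assuming the usual
layout (a header line with CPU names, then one "label: count count ..." line
per interrupt); parse_interrupts and idle_cpus are hypothetical helper names.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into (cpu_names, {label: [counts]})."""
    lines = text.strip("\n").splitlines()
    cpus = lines[0].split()          # header line: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        counts = []
        # Take at most one numeric column per CPU; stop at the first
        # non-numeric token (controller type, device name).
        for tok in rest.split()[:len(cpus)]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        table[label.strip()] = counts
    return cpus, table


def idle_cpus(text):
    """Return the names of CPUs that handled no numbered IRQ at all."""
    cpus, table = parse_interrupts(text)
    totals = [0] * len(cpus)
    for label, counts in table.items():
        if not label.isdigit():      # skip summary lines such as NMI/ERR
            continue
        for i, count in enumerate(counts):
            totals[i] += count
    return [cpu for cpu, total in zip(cpus, totals) if total == 0]
```

Calling idle_cpus(open("/proc/interrupts").read()) on the affected machine
should list CPU1 when booted with "noapic", if the columns look as described
in the quoted mail.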
> I next tried to find out what triggers these APIC errors:
>
> Without "noapic" kernel parameter:
> The Errors are triggered by a certain amount of interrupts, whatever
> device produces interrupts.
>
> With "noapic":
> It seems as if those errors are mostly triggered by NFS. When I copy the
> same
> amount of data with FTP, there are a lot less Errors. (E.g. for 500MB
> there
> are 40 with NFS and only 2 with FTP).
I don't know all the kinds of events the APIC bus is used for; it is not
only used for IRQs.
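
To make a comparison like the NFS vs. FTP one repeatable, the errors logged
during each transfer can be counted from the syslog. A small sketch, assuming
the classic "Mon DD HH:MM:SS host kernel: ..." line format shown in the quoted
messages; count_apic_errors is a hypothetical name, and the window limits are
given in the same timestamp format.

```python
import re
from datetime import datetime

APIC_RE = re.compile(r"APIC error on CPU\d+:")
STAMP_FMT = "%b %d %H:%M:%S"   # syslog timestamp, no year


def count_apic_errors(lines, start, end):
    """Count APIC error lines whose timestamp lies in [start, end]."""
    t_start = datetime.strptime(start, STAMP_FMT)
    t_end = datetime.strptime(end, STAMP_FMT)
    count = 0
    for line in lines:
        if not APIC_RE.search(line):
            continue
        try:
            stamp = datetime.strptime(line[:15], STAMP_FMT)
        except ValueError:
            continue               # line without a leading timestamp
        if t_start <= stamp <= t_end:
            count += 1
    return count
```

Running it once with the start and end time of the NFS copy and once for the
FTP copy gives the two numbers to compare. Since the timestamps carry no year,
both limits have to lie in the same year.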
> What I wonder is why linux outputs a line like this (with noapic):
> <4>Intel MultiProcessor Specification v1.1
> <4> Virtual Wire compatibility mode.
>
> although the board seems to be capable of MPS 1.4 (as there is a Bios
> option "MPS 1.4 for single Processor).
>
One or two years ago I was playing with these options. It seemed that
setting it to 1.1 reduced the error count a little bit, but this may be a
misinterpretation.
-- Karsten Keil, SuSE Labs, ISDN development