Re: NETDEV timeout on tulips [was: Re: 2.4.1-test10]

Jeff Garzik (jgarzik@mandrakesoft.com)
Tue, 23 Jan 2001 12:57:45 -0500


David Ford wrote:
>
> > > Do the tulip driver updates address the increasingly common NETDEV timeout
> > > repots?
> >
> > In general you can answer this yourself by reading
> > drivers/net/tulip/ChangeLog.
> >
> > I don't see increasingly common timeout reports.. with which hardware?
> > They are likely on the newer LinkSys 4.1 cards, and there are still
> > problesm with PNIC. Outside of that, other cards should be ok.
>
> I have four machines now that exhibit this problem. Three have in them the
> Linksys card family, similar PCI cards, one is my laptop which I have three
> different cardbus cards but they all use the tulip driver.
>
> In the PCI situation, not all machines using these cards act the same way.
> I got a 10 pack of LNE100TX cards and so far only two out of the batch are doing
> this, they are all the same revision, identical in every way that I've found.
>
> The three cardbus cards are slightly different in numerous ways. For them they
> normally fault with an APM event, an eject/insert cycle via software will reset
> them and a link down/up won't fix it. For the PCI cards most times a link
> down/up cycle will fix them. It's a 2.4 v.s. 2.2 issue, the 2.2 kernels aren't
> exhibiting this error.

Sounds like the PCI PM state is getting mangled. Can you provide a
"lspci -vvv", as root, for each of the three cardbus cards? Make sure
to run lspci when the cards are up and active and working.

> The PCI cards are hard to get into this state, sometimes they'll run millions of
> packets for months on end before they'll burp. Sometimes it'll happen three
> times a night. The amount of traffic doesn't seem to matter, nor does the type
> of traffic.
>

> 00:0a.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 20)
> Subsystem: Netgear FA310TX

If the link is getting lost (which may explain the randomness of the
error), the following patch might help:
http://sourceforge.net/patch/?func=detailpatch&patch_id=103294&group_id=13004

There are still some media fixes that need to be integrated from the
Becker driver, and tested, too.

Also, downloading tulip-diag.c and capturing the register state before
and after the breakage is useful. A useful command line is "tulip-diag
-mmmaaavvveef".

Jeff

-- 
Jeff Garzik       | "You see, in this world there's two kinds of
Building 1024     |  people, my friend: Those with loaded guns
MandrakeSoft      |  and those who dig. You dig."  --Blondie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/