Re: Via KT133 pci corruption: stock 2.4.18pre2 oopses as well

Ville Herva (vherva@niksula.hut.fi)
Wed, 9 Jan 2002 23:57:22 +0200


On Wed, Jan 09, 2002 at 01:00:53PM -0800, you [Andrew Morton] claimed:
> Ville Herva wrote:
> >
> > >>EIP; c0131ce0 <sync_page_buffers+10/b0> <=====
>
> Looks like a corrupted `next' pointer in the page's buffer_head
> ring. Your report is identical to Todd Eigenschink's repeatable
> oops. http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.3/0689.html
>
> In another thread, yesterday, we were discussing the elusive
> "end_request: buffer-list destroyed" crash.

(...)

> There were VM changes, and a messy, complex and undocumented change to
> sync_page_buffers(), which was the point at which I ceased to understand
> that function.

Nice, yet one more variable in the equation ;). And I thought I could rule
out kernel bugs by reproducing this on a supposedly stable kernel (the 2.2.20
I used had all sorts of patches in it: ide, e2compr and raid, to name the
largest ones).

This could be a sync_page_buffers() bug, but what puzzles me is that I can
reproduce the oopses on 2.2 as well (although those may of course be
different oopses).

Also, I'm seeing IDE and network corruption that would very much point to
PCI transfer corruption. Of course, it could be that the oopses are not
caused by that.
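
One way to look for data-path corruption separately from the oopses is a
plain write-and-read-back test. The following is only a minimal sketch of
that idea, not the procedure used for the reports above; the function
names, the chunk size and the scratch file path are all assumptions made
for illustration.

```python
import hashlib
import os

CHUNK = 1 << 20  # 1 MiB per chunk (arbitrary choice for this sketch)

def pattern_chunk(index, size=CHUNK):
    """Build a deterministic chunk whose contents depend on its index."""
    seed = hashlib.sha256(str(index).encode()).digest()  # 32 bytes
    reps = size // len(seed) + 1
    return (seed * reps)[:size]

def write_pattern(path, chunks):
    """Write `chunks` pattern chunks to `path` and push them to disk."""
    with open(path, "wb") as f:
        for i in range(chunks):
            f.write(pattern_chunk(i))
        f.flush()
        os.fsync(f.fileno())

def verify_pattern(path, chunks):
    """Return the indices of chunks that differ from what was written."""
    bad = []
    with open(path, "rb") as f:
        for i in range(chunks):
            if f.read(CHUNK) != pattern_chunk(i):
                bad.append(i)
    return bad
```

Calling write_pattern() on a scratch file and then verify_pattern() with
the same chunk count should give an empty list on a healthy machine. For
the read-back to exercise the disk and bus at all, the file has to be
larger than RAM (or the cache otherwise defeated); a small file is simply
served back from the page cache.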

> It could just be some random memory scribbler. Dunno yet. It's awfully
> repeatable.

Yep.

-- v --

v@iki.fi