Re: 2.5: ext3 bug or dying drive?

Andrew Morton (akpm@digeo.com)
Thu, 05 Dec 2002 14:03:38 -0800


Robert Love wrote:
>
> Overnight, 2.5.50-mm1 took a big stinky shit:
>
> ...
>
> Rebooted and ext3 replayed the journal and said a manual check was
> needed due to I/O error on the journal.

That'll be e2fsck saying that, when it tries to do journal replay.
I/O errors on the journal during replay are not good.

Were there no I/O error messages reported from the device driver,
block, buffer or pagecache layer? Generally everyone likes to have
a shout as one flies past.

> Ran fsck manually, it found a
> whole bunch of orphan inodes including some scary errors like "inode
> part of corrupt orphan inode list" or similar.
>
> Rebooted again to force another fsck to be sure, and sure enough it
> found more problems. Ugh. I started thinking bad hard drive.
>
> Back up in X, and the same dmesg error occurred again. Repeat above.
>
> Now I am in 2.4 and all seems well. So perhaps not hard drive?

Well. Between 2.4 and 2.5 the driver, SCSI layer, block layer, VFS
and ext3 have all changed. Could be anywhere :(

> IBM U2W drive on a 2940U2W if it matters. UP kernel.

It would be useful to give the IO system a bit of a thrashing,
to narrow the problem down. Just a `cat /dev/sda[n] > /dev/null'
would suit.
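
If it helps to know how far the read got before anything went wrong,
something along these lines would do the same job as the cat. It is
only a sketch: read_through is a made-up helper name, the 1 MiB chunk
size is arbitrary, and it assumes it is run as root against whichever
partition is suspect. Like cat, it gives up at the first read error.

```python
import os
import sys


def read_through(path, chunk_size=1 << 20):
    """Read path sequentially and discard the data.

    Returns (bytes_read, error), where error is None on a clean run
    or the OSError raised by the first failing read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            try:
                chunk = os.read(fd, chunk_size)
            except OSError as exc:
                # First failing read: report how far we got and stop.
                return total, exc
            if not chunk:
                # End of device or file.
                return total, None
            total += len(chunk)
    finally:
        os.close(fd)


if __name__ == "__main__" and len(sys.argv) > 1:
    done, err = read_through(sys.argv[1])
    if err is None:
        print(f"read {done} bytes, no errors")
    else:
        print(f"read error after {done} bytes: {err}")
```

Whatever this prints, the kernel log is still the place to look for
complaints from the driver or block layer during the run.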

Bottom line: dunno.