ext2 corruption in 2.4.2, scsi only system

Dale E Martin (dmartin@cliftonlabs.com)
Mon, 26 Mar 2001 10:40:19 -0500


Hello. I've got a dual PPro machine running 2.4.2 and Debian (stable plus a
little bit of "testing"). This machine is heavily loaded about 3/4 of the
time, doing a daily regression test on a project we're working on.
Basically, the machine runs g++ about 18 hours a day on machine-generated
code.

With 2.2.17 the machine would have process hangs after a couple of days;
this was repeatable. (Specifically, g++ would start hanging, even when
compiling "hello world".)

I had had good luck with 2.4.x on other boxes, so I put it on this machine
as well. Several times now I've seen ext2 corruption with no other
noteworthy logs.

The most recent time, we saw this, with no other kernel messages before it:
Mar 23 18:40:02 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_free_blocks: Freeing blocks in system zones - Block = 655552, count = 1
Mar 23 19:18:33 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_new_block: Allocating block in system zone - block = 655552
Mar 24 18:40:04 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_free_blocks: Freeing blocks in system zones - Block = 655552, count = 1
Mar 24 19:07:22 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_new_block: Allocating block in system zone - block = 655552
Mar 25 16:45:01 woodlawn kernel: EXT2-fs error (device sd(8,1)):
ext2_free_blocks: bit already cleared for block 460146
Mar 25 16:45:01 woodlawn kernel: Remounting filesystem read-only
Mar 25 18:40:03 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_free_blocks: Freeing blocks in system zones - Block = 655552, count = 1
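
To make the pattern in the entries above easier to see, here is a minimal
sketch that counts EXT2-fs errors per device and block number. The regular
expression and the names LOG, ENTRY and summarize are my own illustration,
not part of any kernel tool, and the pattern is only assumed to fit the
message wording quoted in this mail.

```python
import re
from collections import Counter

# Sample copied from the Mar 23-25 entries quoted above.
LOG = """\
Mar 23 18:40:02 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_free_blocks: Freeing blocks in system zones - Block = 655552, count = 1
Mar 23 19:18:33 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_new_block: Allocating block in system zone - block = 655552
Mar 24 18:40:04 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_free_blocks: Freeing blocks in system zones - Block = 655552, count = 1
Mar 24 19:07:22 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_new_block: Allocating block in system zone - block = 655552
Mar 25 16:45:01 woodlawn kernel: EXT2-fs error (device sd(8,1)):
ext2_free_blocks: bit already cleared for block 460146
Mar 25 16:45:01 woodlawn kernel: Remounting filesystem read-only
Mar 25 18:40:03 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_free_blocks: Freeing blocks in system zones - Block = 655552, count = 1
"""

# Hypothetical pattern: device, reporting function, then the first
# "block <n>" / "Block = <n>" that follows. re.S lets an entry span the
# line break between the header and the message.
ENTRY = re.compile(
    r"EXT2-fs error \(device (sd\(\d+,\d+\))\):\s*(\w+):.*?[Bb]lock\s*=?\s*(\d+)",
    re.S,
)


def summarize(log_text):
    """Return a Counter keyed by (device, block number)."""
    counts = Counter()
    for device, _func, block in ENTRY.findall(log_text):
        counts[(device, int(block))] += 1
    return counts


if __name__ == "__main__":
    for (device, block), n in summarize(LOG).most_common():
        print(device, block, n)
```

On this sample, every sd(8,6) error refers to the same block, alternating
between ext2_free_blocks and ext2_new_block.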

On the boot prior, we saw this:
Mar 8 11:34:08 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_free_blocks: Freeing blocks in system zones - Block = 720946, count = 1
Mar 8 13:10:53 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_new_block: Allocating block in system zone - block = 720946
Mar 8 13:13:49 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_free_blocks: Freeing blocks in system zones - Block = 720946, count = 1
Mar 8 15:32:52 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_new_block: Allocating block in system zone - block = 720946
Mar 8 15:35:20 woodlawn kernel: EXT2-fs error (device sd(8,6)):
ext2_free_blocks: Freeing blocks in system zones - Block = 720946, count = 1
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (52171)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (30564)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (51522)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (71400)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (70072)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (72163)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (57522)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (30062)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (31137)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (30532)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (0)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (0)
Mar 9 06:26:07 woodlawn kernel: init_special_inode: bogus imode (71563)
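
As a rough check on the "bogus imode" values above, the sketch below
extracts the file-type bits from each one. It assumes the kernel prints the
mode in octal; that is my reading of the message, not something confirmed
in this mail. The names LOGGED, KNOWN_TYPES and classify are purely
illustrative.

```python
import stat

# Values copied from the init_special_inode messages quoted above.
LOGGED = ["52171", "30564", "51522", "71400", "70072", "72163", "57522",
          "30062", "31137", "30532", "0", "0", "71563"]

# The standard Unix file-type codes, as exposed by Python's stat module.
KNOWN_TYPES = {
    stat.S_IFIFO: "fifo",
    stat.S_IFCHR: "char device",
    stat.S_IFDIR: "directory",
    stat.S_IFBLK: "block device",
    stat.S_IFREG: "regular file",
    stat.S_IFLNK: "symlink",
    stat.S_IFSOCK: "socket",
}


def classify(printed):
    """Return (type bits, type name) for a mode as printed in the log."""
    mode = int(printed, 8)         # assumption: the log value is octal
    type_bits = stat.S_IFMT(mode)  # keep only the file-type bits
    return type_bits, KNOWN_TYPES.get(type_bits, "unknown")


if __name__ == "__main__":
    for value in LOGGED:
        type_bits, name = classify(value)
        print(f"{value:>6} -> type bits {type_bits:06o} ({name})")
```

Under that assumption, none of the logged values carries a valid file type,
which would fit inodes being read from blocks that hold unrelated data.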

At that point, some large files that could not easily be removed had
appeared on the filesystem in question.

The machine is a dual PPro with a Buslogic BT958 and a single 9G wide SCSI
drive. There aren't any logs related to physical disk problems. If the
machine starts doing this again, I'll take a look in /proc/scsi/Buslogic
and see if it's showing any errors in there.

Thanks for any help, and please let me know if I need to supply any other
info. I guess the only other thing I can think of is the kernel is
compiled with gcc 2.95.2, which I realize is considered "slightly risky" or
so....

Thanks,
Dale

-- 
Dale E. Martin, Clifton Labs, Inc.
Senior Computer Engineer
dmartin@cliftonlabs.com
http://www.cliftonlabs.com
pgp key available