Re: 2.4.21 reiserfs oops

Nix (nix@esperi.demon.co.uk)
Tue, 24 Jun 2003 21:34:55 +0100


On Tue, 24 Jun 2003, Oleg Drokin moaned:
> Hello!
>
> On Mon, Jun 23, 2003 at 11:16:27PM +0100, Nix wrote:
>
>> >> Jun 22 13:52:42 loki kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000001
>> > This is very strange address to oops on.
>> I'll say! Looks almost like it JMPed to a null pointer or something.
>
> No, if it'd jumped to a NULL pointer, we'd see 0 in EIP.

JMPed to ((long)NULL)+1 or something then :) the fact remains that it's
not somewhere that even a memory error would make us likely to jump to.

>> >> Jun 22 13:52:43 loki kernel: EIP: 0010:[<c0092df4>] Not tainted
>> > And the EIP is prior to kernel start which is also very strange.
>> > On the other hand the address c0192df4 is somewhere inside reiserfs code,
>> > so it looks like a single bit error, I'd say.
>> I think it unlikely to be RAM problems given that the problem happened
>> shortly after upgrading to 2.4.21; this was about half a day after I
>> rebooted it because it threw a pile of never-seen-again, un-syslogged
>> SCSI abort errors at me (sym53c875); and *that* was a few minutes after
>> I rebooted into 2.4.21 for the first time.
>
> Hm, so first there were some scsi problems and then reiserfs oops?

Different boots. I upgraded, the first boot crashed within five minutes
with weird SCSI errors, so I rebooted again and this happened six hours
later.

I'm willing to write off the SCSI errors to the shock effect of having
just been powered down for the first time in a year (the shutdown
scripts didn't quite work and the reset button is disconnected).

>> Actually since the RAM is good, I see no good reason for this to happen.
>> (actually I see no good reason for valid code before _text, either).
>
> I wonder if 2.4.21 constantly crashes like that for you, then?

No obvious sign of it:

9:34pm up 1 day 22:30, 14 users, load average: 0.09, 0.12, 0.16

(it is of course waiting until I am hundreds of miles away. *Then* it'll
crash.)

-- 
`It is an unfortunate coincidence that the date locarchive.h was
 written (in hex) matches Ritchie's birthday (in octal).'
               -- Roland McGrath on the libc-alpha list
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/