Re: [SMP] 2.4.5-ac13 memory corruption/deadlock?

Rico Tudor (rico-linux-kernel@patrec.com)
Tue, 19 Jun 2001 15:59:26 -0500 (CDT)


Are you sure about bad memory?

With ECC memory, single-bit errors will be corrected; double-bit errors
will generate an NMI. You can also find memory errors with an exerciser.
Unfortunately, trusty memtest86 bombs on my ServerWorks machine. Instead I use

http://www.qcc.sk.ca/~charlesc/software/memtester/

which runs in user mode. I diagnosed thermal problems by running
this utility. Within 3 minutes of a cold start, it raised main memory
temperature enough to induce a hard error, which was detected
simultaneously by the utility and by the hardware (NMI taken by the kernel).
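
For anyone unfamiliar with what a user-mode exerciser does, here is a
minimal sketch in C. It is not memtester's actual algorithm, only an
illustration of the general write-then-verify idea. The function names
(check_pattern, exercise_memory) and the choice of patterns are my own
assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: fill buf with one pattern, read it back,
 * and return the number of words that did not match.
 * volatile keeps the compiler from optimizing the read-back away. */
size_t check_pattern(volatile uint32_t *buf, size_t words, uint32_t pattern)
{
    size_t errors = 0;
    size_t i;

    assert(buf != NULL);

    for (i = 0; i < words; i++)
        buf[i] = pattern;

    for (i = 0; i < words; i++)
        if (buf[i] != pattern)
            errors++;

    return errors;
}

/* Hypothetical driver: allocate `bytes` of memory and run `passes`
 * rounds of several bit patterns over it. Returns the total number
 * of mismatched words, or (size_t)-1 if the allocation fails. */
size_t exercise_memory(size_t bytes, unsigned passes)
{
    /* All zeros, all ones, and the two alternating-bit patterns. */
    static const uint32_t patterns[] = {
        0x00000000u, 0xFFFFFFFFu, 0xAAAAAAAAu, 0x55555555u
    };
    const size_t npatterns = sizeof(patterns) / sizeof(patterns[0]);
    size_t words = bytes / sizeof(uint32_t);
    size_t errors = 0;
    unsigned pass;
    size_t p;
    uint32_t *buf;

    buf = malloc(words * sizeof(uint32_t));
    if (buf == NULL)
        return (size_t)-1;

    for (pass = 0; pass < passes; pass++)
        for (p = 0; p < npatterns; p++)
            errors += check_pattern(buf, words, patterns[p]);

    free(buf);
    return errors;
}
```

Calling exercise_memory(64 * 1024 * 1024, 10) from main() and printing
the result would be one way to try it; any nonzero count is a mismatch.
Keep in mind that a user-mode program only touches the pages the kernel
happens to hand it, so a sketch like this cannot target specific
physical addresses, and a clean run proves little about the rest of RAM.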

Can you recommend one of your (shorter) tests for me to try?