Re: Wow! Is memory ever cheap!

Edgar Toernig (froese@gmx.de)
Wed, 09 May 2001 23:04:18 +0200


Larry McVoy wrote:
>
> Let's review: ECC is nice, but it doesn't solve all data corruption
> problems. Applications which do their own end to end data integrity
> checks will catch many more error cases than what ECC catches.

I think you have the wrong idea of why ECC is there. ECC deals with
the inherent shortcomings of DRAM.

DRAMs are not perfect. Each bit has some probability of being lost.
Normally this probability is low enough to live with. Let's say
you have a system with 1MByte and let's say the rate of single bit
errors is around 1 error in 100 years. Good enough.
Now put 1GByte in the system. With about 1000 times as many bits you
get a rate of about 10 errors per year. Maybe good enough for a
Windows box but not acceptable for your server. So you put in ECC to
bring this rate back down to reasonable numbers. ECC can correct the
single bit errors. You only have to deal with double bit errors. The
chance of those is much, much lower.
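
To make the arithmetic concrete, here is a rough sketch in Python. The
base rate (1 error per 100 years for 1MByte) is just the made-up figure
from above, not a measured value, and the linear scaling with memory
size is an assumption. The name errors_per_year is only for
illustration.

```python
# Assumption (illustrative figure from the text, not measured data):
# 1 MByte of DRAM sees about 1 single bit error per 100 years, and the
# rate grows linearly with the amount of memory.
BASE_MBYTES = 1
BASE_ERRORS_PER_YEAR = 1 / 100


def errors_per_year(mbytes):
    """Expected single bit errors per year for `mbytes` of DRAM."""
    return BASE_ERRORS_PER_YEAR * (mbytes / BASE_MBYTES)


if __name__ == "__main__":
    # 1 GByte is taken as 1024 MByte here.
    for mbytes in (1, 64, 1024):
        rate = errors_per_year(mbytes)
        print(f"{mbytes:5d} MByte: {rate:6.2f} errors/year")
```

For 1024 MByte this gives roughly 10 errors per year, which is where
the number above comes from.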

Sure, it doesn't solve all data corruption problems - only simple
errors in DRAMs. But it lets systems with huge amounts of RAM stay
up much longer. And btw, your integrity checks over data will
not protect against a corrupted kernel or application...

Ciao, ET.

PS: Just let your app run long enough. I'm sure it will detect a
checksum error some day ;-)
