Re: [STATUS 2.5] October 30, 2002

Alan Cox (alan@lxorguk.ukuu.org.uk)
01 Nov 2002 16:49:48 +0000


On Fri, 2002-11-01 at 14:05, Eric W. Biederman wrote:
> When you have a correctable ECC error on a page you need to rewrite the
> memory to remove the error. This prevents the correctable error from becoming
> an uncorrectable error if another bit goes bad. Also if you have a
> working software memory scrub routine you can be certain multiple
> errors from the same address are actually distinct. As opposed to
> multiple reports of the same error.

Note that this area has some extremely "interesting" properties. For one
you have to be very careful what operation you use to scrub and its
platform specific. On x86 for example you want to do something like lock
addl $0, mem. A simple read/write isnt safe because if the memory area
is a DMA target your read then write just corrupted data and made the
problem worse not better!

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/