Re: [PATCH] 64 bit scsi read/write

Daniel Phillips (phillips@bonn-fries.net)
Mon, 23 Jul 2001 16:41:24 +0200


On Sunday 22 July 2001 05:52, Albert D. Cahalan wrote:
> Alexander Griesser writes:
> > On Sun, Jul 15, 2001 at 09:08:41PM -0400, you wrote:
> >> In a tree-structured filesystem, checksums on everything would
> >> only cost you space similar to the number of pointers you have.
> >> Whenever a non-leaf node points to a child, it can hold a checksum
> >> for that child as well. This gives a very reliable way to spot
> >> filesystem errors, including corrupt data blocks.
> >
> > Hmm, maybe this is crap, but: if the checksum calculation for one
> > node fails, wouldn't that mean that the data in this node is not
> > to be trusted? Then the checksums stored in this node could also
> > be corrupted, so the node 2 hops away can't be validated with
> > 100% certainty...
>
> If I understand you right ("one"? "this"?), yes and we want that.
>
> Node 1 has children 2, 3, and 4.
> Node 3 has children 5, 6, and 7.
> Node 6 has children 8, 9, and 10. (children might be data blocks)
>
> To have a child is to have a checksum+pointer pair.
>
> If node 3 contains a corrupt pointer to node 6, then it is unlikely
> that the checksum will match. So node 6 is bad, along with 8, 9, and 10.
> (actually we might not be able to know that 8, 9, and 10 exist)
> This result is wonderful, since it prevents interpreting random
> disk blocks as useful data.
>
> If node 3 contains a corrupt checksum for node 6, same thing. Damn.
> This case should be rare, since why would node 1 have a checksum
> that is OK for node 3 if node 3 has corruption?
>
> If node 6 itself is corrupt, same thing. Good, we are stopped from
> using bad data.

I agree that your suggestion will work and that doubling the size of
the metadata isn't an enormous cost, especially if you'd already
compressed it using extents. On the other hand, sometimes I just feel
like trusting the hardware a little. Both atomic-commit and
journalling strategies take care of normal failure modes, and the disk
hardware is supposed to flag other failures through the ECC it keeps
for each sector on disk.
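
For concreteness, here is a rough sketch of the quoted scheme, reusing
the node numbers from Albert's example. Everything in it is an
assumption made for illustration: the dict standing in for the disk,
JSON as the node format, CRC32 as the checksum (the thread names no
algorithm), the helper names, and the idea that the checksum of the
root node is kept somewhere trusted, such as a superblock.

```python
import json
import zlib

def checksum(raw):
    # CRC32 is only a stand-in; the thread does not name an algorithm.
    return zlib.crc32(raw)

def write_node(disk, blockno, children, payload=""):
    """Store a node and return its checksum for the parent to record.

    children is a list of (pointer, checksum) pairs.
    """
    raw = json.dumps({"children": children, "payload": payload}).encode()
    disk[blockno] = raw
    return checksum(raw)

def read_verified(disk, blockno, expected):
    """Decode a block only if it matches the checksum its parent holds."""
    raw = disk.get(blockno)
    if raw is None or checksum(raw) != expected:
        return None
    return json.loads(raw)

def find_bad(disk, blockno, expected):
    """Walk down from a trusted checksum; return blocks that fail."""
    node = read_verified(disk, blockno, expected)
    if node is None:
        # Stop here: pointers inside a bad node cannot be trusted.
        return [blockno]
    bad = []
    for ptr, csum in node["children"]:
        bad.extend(find_bad(disk, ptr, csum))
    return bad

def build_example():
    """Build the tree from the quoted example, bottom-up."""
    disk = {}
    kids6 = [(n, write_node(disk, n, [], "data %d" % n)) for n in (8, 9, 10)]
    c6 = write_node(disk, 6, kids6)
    kids3 = [(5, write_node(disk, 5, [], "data 5")),
             (6, c6),
             (7, write_node(disk, 7, [], "data 7"))]
    c3 = write_node(disk, 3, kids3)
    kids1 = [(2, write_node(disk, 2, [], "data 2")),
             (3, c3),
             (4, write_node(disk, 4, [], "data 4"))]
    root_csum = write_node(disk, 1, kids1)  # assumed to live in a superblock
    return disk, root_csum
```

With this layout, flipping a byte in block 6 makes find_bad() report
block 6 and never look at 8, 9, or 10. Flipping a byte in block 3,
which is what a damaged pointer or checksum for node 6 amounts to,
makes node 3 itself fail against the checksum held in node 1.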

--
Daniel