Re: [reiserfs-list] ReiserFS data corruption in very simple configuration

©[@ÄØÿ¿8þÿ¿Dag Nygren (pcg@goof.com)
Sat, 29 Sep 2001 14:52:29 +0200


On Sat, Sep 29, 2001 at 12:44:59AM -0400, Lenny Foner <foner-reiserfs@media.mit.edu> wrote:
> isn't fixed by fsck? [See outcome (d) below.] I'm having difficulty
> believing how this can be possible for a non-journalling filesystem.

If you have difficulties in believing this, may I ask you how you think it
is possible for a non-journaling filesystem to prevent this at all?

> But what about written to the wrong files? See below.

What you see is most probably *old* data, not data from another (still
existing) file.

> has not happened yet. (I don't know how often reiserfs will be synced
> by default; 60 seconds? Longer? Presumably running "sync" will force

mostly like with any other filesystem (man bdflush)

> Now, we have the following possibilities for the outcome after the

> (a) Metadata correctly written, file data correctly written.

all filesystems ;)

> (b) Metadata correctly written, file data partially written.
> (E.g., one or both files might have been partially or completely
> updated.)

ext2, reiserfs.

> (c) Metadata correctly written, file data completely unwritten.
> (Neither file got updated at all.)

ext2, reiserfs.

> (d) Metadata correctly written, FILE DATA INTERCHANGED BETWEEN A AND B.

this shouldn't happen on reiserfs. however, the unwritten parts of file a can easily
contain data formerly in file b.

> (e) Metadata corrupted in some fashion, file data undefined.
> ("Undefined" means could be any of (a) through (d) above; I don't care.)

this should be prevented by journaling (of course, this won't help against
harddisk failures) on reiserfs. ext2 often has this problem, but fsck usually
can repair it. it's easy to tell metadata from filedata on ext2.

> whether we can "guarantee that all the data blocks have been written",
> but may be missing the point I was making, namely that THE BLOCKS HAVE
> BEEN WRITTEN TO THE WRONG FILES.

remember that the blocks have previous content, and reiserfs' tails
optimization means that files appended all the time (wtmp) can move around
rapidly (at least their tail).

> P.P.S. I say reset and not power-off, although I hope that this is
> moot, because I presume that the unsynced data, by virtue of being
> unsynced, is nowhere near the disk datapaths anyway.

this can make a big difference. many disks (ibm, maxtor) nowadays write
partial blocks on power outage, this gives "Uncorrectable read errors",
which is fatal, because no filesystem so far can work around this. It's
easy to repair (just rewrite the block), but would requite filesystem
feedback (hey, reisrefs, this metadata block is *garbage*).

> a reset should let the disks continue to write data out of their write
> buffers, assuming that a CPU reset doesn't flush such pending

they should, yes. OTOH, ide disks are cheap...

> not matter; is it common that disks might leave data buffered but
> unwritten for 30 seconds if there is no other disk activity? I would

no. and it doesn't make sense. but it's not forbidden or sth.

-- 
      -----==-                                             |
      ----==-- _                                           |
      ---==---(_)__  __ ____  __       Marc Lehmann      +--
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com      |e|
      -=====/_/_//_/\_,_/ /_/\_\       XX11-RIPE         --+
    The choice of a GNU generation                       |
                                                         |
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/