Re: COW fs (Re: Editing-in-place of a large file)

VDA (VDA@port.imtp.ilyichevsk.odessa.ua)
Mon, 10 Sep 2001 12:28:51 +0300


JR> I've tried this idea. I did an MD5 of every block (4KB) in a partition
JR> and counted the number of blocks with the same hash. Only about 5-10% of
JR> blocks on several filesystems were actually duplicates. This might be
JR> better if you reduced the block size to 512 bytes, but there's a
JR> question of how much extra space filesystem structures would then take
JR> up.
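
(Just to make that measurement concrete, here's roughly how such a scan
could look in userspace -- a minimal sketch, not JR's actual script, just
the same idea. Assumptions: Python with hashlib, a readable partition or
fs image at DEVICE, 4KB blocks; it only counts how many blocks share an
MD5 digest.)

import hashlib
from collections import Counter

DEVICE = "/dev/hda5"      # assumption: any partition or image you can read
BLOCK_SIZE = 4096         # 4KB blocks, as in the experiment above

counts = Counter()
with open(DEVICE, "rb") as dev:
    while True:
        block = dev.read(BLOCK_SIZE)
        if len(block) < BLOCK_SIZE:
            break
        counts[hashlib.md5(block).digest()] += 1

total = sum(counts.values())
if total:
    # every block beyond the first in a group of identical blocks is a dupe
    dupes = sum(n - 1 for n in counts.values() if n > 1)
    print("blocks: %d, duplicates: %d (%.1f%%)"
          % (total, dupes, 100.0 * dupes / total))

Dropping BLOCK_SIZE to 512 is a one-constant change, at the cost of eight
times as many hashes and a lot more metadata on a real filesystem.
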

JR> Basically, it didn't look like compressing duplicate blocks would
JR> actually be worth the extra structures or CPU.

JR> On the other hand, a COW fs would be excellent for making file copying
JR> much quicker. You can do things like copying the linux kernel tree using
JR> 'cp -lR', but the files do not act as if they are unique copies - and
JR> I've been bitten many times when I forgot this. If you had COW, you
JR> could just copy the entire tree and forget about the fact they're
JR> linked.

Yeah, this is mostly the kind of COW fs usage I'm thinking about. You could
copy gigabytes in an instant and never bother with tracking duplicate
files ("zero blocks left??? where the hell did I copy those .mpg's???").

Right now we sometimes use hardlinks as a "poor man's COW fs", but
that's error prone. Every now and then you forget it's a
hardlinked kernel tree and happily start hacking in it... :-(
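
Here's the trap in miniature -- a sketch, assuming Python, an empty
working directory and made-up file names. Both names share one inode, so
hacking on the "copy" silently changes the "original" too, exactly the
surprise you get after 'cp -lR' of a kernel tree:

import os

with open("Makefile.orig", "w") as f:
    f.write("KERNELRELEASE := 2.4.9\n")

os.link("Makefile.orig", "Makefile.copy")   # "poor man's COW": a hardlink

with open("Makefile.copy", "a") as f:       # hack on the "copy"...
    f.write("EXTRAVERSION := -hacked\n")

print(open("Makefile.orig").read())         # ...and the "original" changed too
print(os.stat("Makefile.orig").st_nlink)    # 2 -- one inode, two names

With a real COW fs the second write would simply get its own blocks and the
"original" would stay untouched.
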

A "compressor" which hunts and merges duplicate blocks is a bonus,
not a primary tool.

-- 
Best regards,
VDA
mailto:VDA@port.imtp.ilyichevsk.odessa.ua
http://port.imtp.ilyichevsk.odessa.ua/vda/
