Re: Journaling pointless with today's hard disks?

Rob Landley (landley@trommello.org)
Mon, 26 Nov 2001 11:59:08 -0500


On Sunday 25 November 2001 04:14, Chris Wedgwood wrote:

>
> P.S. Write-caching in hard-drives is insanely dangerous for
> journalling filesystems and can result in all sorts of nasties.
> I recommend people turn this off in their init scripts (perhaps I
> will send a patch for the kernel to do this on boot, I just
> wonder if it will eat some drives).

Anybody remember back when hard drives didn't reliably park their heads when
they lost power? This isn't something drive makers seem to pay much attention
to until customers scream at them for a while...

Having no write caching on the IDE side isn't a solution either. The problem
is that the largest block of data you can send to an ATA drive in a single
command is smaller than a modern track (let alone all the tracks under the
heads on a multi-head drive). So without any sort of caching in the drive at
all, every write request pays rotational latency: the drive has to wait for
the point where the previous write left off to come back under the head.
This will positively KILL write performance. (I suspect the situation's more
or less the same for reads too, but nobody's objecting to read caching.)
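
To put rough numbers on the rotational-latency argument above, here's a
back-of-the-envelope sketch in Python. The 256-sector limit is the 8-bit
sector count of the 28-bit ATA read/write commands. The 7200 RPM spindle
speed and the 512 KiB track size are assumptions picked for illustration,
not measurements of any real drive, and the function names are made up.

```python
SECTOR_BYTES = 512
MAX_SECTORS_PER_CMD = 256  # 8-bit sector count field (0 is read as 256)
MAX_CMD_BYTES = SECTOR_BYTES * MAX_SECTORS_PER_CMD  # 128 KiB per command


def streaming_rate(rpm, track_bytes):
    """Bytes/second if the drive can write whole tracks back to back."""
    rev_s = 60.0 / rpm  # time for one platter revolution
    return track_bytes / rev_s


def uncached_rate(rpm, track_bytes, cmd_bytes=MAX_CMD_BYTES):
    """Bytes/second with no drive buffer at all (simplified model).

    Assumption: by the time the next command arrives, the head has just
    passed the spot where the previous write ended, so the drive waits
    one full revolution before it can continue.
    """
    rev_s = 60.0 / rpm
    transfer_s = rev_s * cmd_bytes / track_bytes  # time spent actually writing
    return cmd_bytes / (transfer_s + rev_s)


if __name__ == "__main__":
    rpm, track = 7200, 512 * 1024  # assumed example drive
    fast = streaming_rate(rpm, track)
    slow = uncached_rate(rpm, track)
    print(f"streaming: {fast / 1e6:.1f} MB/s")
    print(f"uncached:  {slow / 1e6:.1f} MB/s")
    print(f"slowdown:  {fast / slow:.1f}x")
```

In this model the slowdown works out to (track size / command size) + 1, so
the gap gets worse as tracks grow relative to the command limit.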

The solution isn't to avoid write caching altogether (performance is 100%
guaranteed to suck otherwise, not because of how well your hardware works but
because of legacy request size limits in the ATA specification), but to
have a SMALL write buffer, one or two tracks in size, so that linear ATA
write requests can be assembled into single whole-track writes, and to make
sure the drive's electronics have enough capacitance to flush this buffer
to disk. (How much do capacitors cost? We're talking what, an extra 20
milliseconds? The buffer should be small enough that you don't have to do
that much seeking.)
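
For a feel of what "enough capacitance" means, here's a rough sizing sketch
using the energy stored in a capacitor, E = C * V^2 / 2. Only the 20 ms
figure comes from the text above. The 5 W draw and the 12 V to 9 V usable
voltage range are pure guesses for illustration, and holdup_capacitance is
a made-up helper name, not anything from a real drive design.

```python
def holdup_capacitance(power_w, hold_s, v_start, v_min):
    """Capacitance (farads) needed to supply power_w for hold_s seconds
    while the rail sags from v_start down to v_min.

    Usable energy between the two voltages is C * (v_start^2 - v_min^2) / 2,
    and the load needs power_w * hold_s joules. Conversion losses are ignored.
    """
    if v_min >= v_start:
        raise ValueError("v_min must be below v_start")
    energy_j = power_w * hold_s
    return 2.0 * energy_j / (v_start ** 2 - v_min ** 2)


if __name__ == "__main__":
    # Assumed numbers: 5 W while flushing, 20 ms hold-up, 12 V rail, 9 V cutoff.
    c = holdup_capacitance(5.0, 0.020, 12.0, 9.0)
    print(f"{c * 1e6:.0f} microfarads")
```

Plug in your own guesses for the power draw and voltage range; the answer
scales linearly with both the power and the hold-up time.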

Just add an off-the-shelf capacitor to your circuit. The firmware already
has to detect power failure in order to park the head sanely, so make it
flush the buffers along the way. This isn't brain surgery, it just wasn't a
design criterion on IBM's checklist of features approved in the meeting.
(Maybe they ran out of donuts and adjourned the meeting early?)

Rob