Re: [Ext2-devel] disk throughput

Mike Fedyk (mfedyk@matchmail.com)
Sun, 4 Nov 2001 19:32:32 -0800


On Sun, Nov 04, 2001 at 06:13:19PM -0800, Andrew Morton wrote:
> I've been taking a look at one particular workload - the creation
> and use of many smallish files. ie: the things we do every day
> with kernel trees.
>
> There are a number of things which cause linux to perform quite badly
> with this workload. I've fixed most of them and the speedups are quite
> dramatic. The changes include:
>
> - reorganise the sync_inode() paths so we don't do
> read-a-block,write-a-block,read-a-block, ...
>

Why can't the elevator do that? I'm all for better performance, but if
the elevator can do it, then it will work for other file systems too.

> - asynchronous preread of an ext2 inode's block when we
> create the inode, so:
>
> a) the reads will cluster into larger requests and
> b) during inode writeout we don't keep stalling on
> reads, preventing write clustering.
>

This may be what is inhibiting the elevator...
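
To make that guess concrete, here is a toy model (not kernel code) of why
interleaved reads could leave the elevator with nothing to merge. It assumes
that each read is synchronous, so any queued writes are dispatched before the
caller blocks, and that queued writes to adjacent blocks merge into a single
request. The function name and the block numbers are made up for
illustration.

```python
def dispatch_count(ops):
    """Count requests reaching the disk in a toy elevator model.

    ops is a list of ("R", block) or ("W", block) tuples.
    Assumption: reads are synchronous and are never merged here.
    """
    requests = 0
    pending = []  # queued writes not yet dispatched

    def flush():
        nonlocal requests
        # Sort the queued writes and merge runs of adjacent blocks
        # into one request each.
        prev = None
        for block in sorted(pending):
            if prev is None or block != prev + 1:
                requests += 1
            prev = block
        pending.clear()

    for kind, block in ops:
        if kind == "W":
            pending.append(block)
        else:
            # Blocking read: the queue drains before the caller waits.
            flush()
            requests += 1
    flush()
    return requests


# read-a-block, write-a-block, read-a-block, ...
interleaved = [op for b in range(8) for op in (("R", 100 + b), ("W", b))]
# all reads first (as a preread would arrange), then all writes
batched = [("R", 100 + b) for b in range(8)] + [("W", b) for b in range(8)]

print(dispatch_count(interleaved), dispatch_count(batched))
```

In this model the interleaved pattern never has more than one write queued
when a read arrives, so no merging happens; with the reads out of the way,
the eight adjacent writes collapse into one request.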

> The above changes are basically a search-and-destroy mission against
> the things which are preventing effective writeout request merging.
> Once they're in place we also need:
>
> - Alter the request queue size and elvtune settings
>

What settings are you suggesting? The 2.4 elevator queue size is an
order of magnitude larger than 2.2...

>
> The time to create 100,000 4k files (10 per directory) has fallen
> from 3:09 (3min 9second) down to 0:30. A six-fold speedup.
>

Nice!

>
> All well and good, and still a WIP. But by far the most dramatic
> speedups come from disabling ext2's policy of placing new directories
> in a different blockgroup from the parent:
[snip]
> A significant thing here is the improvement in read performance as well
> as writes. All of the other speedup changes only affect writes.
>
> We are paying an extremely heavy price for placing directories in
> different block groups from their parent. Why do we do this, and
> is it worth the cost?
>

My God! I'm no kernel hacker, but I would think the first thing you would
want to do is keep similar data (in this case similar because of proximity
in the dir tree) as close as possible to reduce seeking...
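
A rough way to see the seek argument is the toy model below. It is not
ext2's actual allocator: the round-robin "spread" policy, the group count,
and the directory count are all hypothetical, and seek cost is simply
measured as the number of block groups the head moves between consecutive
directories.

```python
def total_seek(groups):
    """Sum of block-group distances between consecutively visited directories."""
    return sum(abs(b - a) for a, b in zip(groups, groups[1:]))


NUM_GROUPS = 64    # hypothetical number of block groups
NUM_DIRS = 1000    # hypothetical number of directories walked in order

# Hypothetical "spread" policy: each new directory lands in the next group.
spread = [i % NUM_GROUPS for i in range(NUM_DIRS)]
# "Clustered" policy: every directory stays in its parent's group.
clustered = [0] * NUM_DIRS

print(total_seek(spread), total_seek(clustered))
```

Under these assumptions, walking the tree with the spread layout crosses a
block group boundary on every directory, while the clustered layout crosses
none.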

Is there any chance that this will go into ext3 too?

Mike