Re: [Ext2-devel] disk throughput

Linus Torvalds (torvalds@transmeta.com)
Mon, 5 Nov 2001 18:10:47 -0800 (PST)


On Mon, 5 Nov 2001, Alexander Viro wrote:
>
> Depends on the naming scheme used by admin, for one thing. Linus, I seriously
> think that getting the real-life traces to play with would be a Good Thing(tm)
> - at least that would let us test how well heuristics of that kind would do.

Well, we have some real-life traces. Those particular real-life traces show
that the "switch cylinder groups on mkdir" heuristic sucks.

And you have to realize that _whatever_ we do, it will always be a
heuristic. We don't know what the right behaviour is without being able to
predict the future. Agreed?

Ok, so let's just assume that we had no heuristic in place at all, and we
were looking at improving it for slow-growth situations. I think you'd
agree that decreasing the performance of a CVS checkout five-fold would be
considered _totally_ unacceptable for a new heuristic, no?

So the current heuristic provably sucks. We have cold hard numbers, and
quite frankly, Al, there is very very little point in arguing against
numbers. It's silly. "Gimme an S, gimme a U, gimme a C, gimme a K -
S-U-C-K". The current one sucks.

So the question on the table is not "do we stay with the current
algorithm", because I dare say the answer to that question is very
probably a clear "NO". So there's no point in even trying to find out
_why_, in 1984, it was considered a good heuristic.

The question is: "what can we do to improve it?". Not "what arguments can
we come up with to make excuses for a sucky algorithm that clearly does
the wrong thing for real-life loads".

One such improvement has already been put on the table: remove the
algorithm, and make it purely greedy.

We know that works. And yes, we realize that it has downsides too. Which
is why some kind of hybrid is probably called for. Come up with your own
sixth sense. We know the current one is really bad for real-life loads.
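
To make the two extremes concrete, here is a minimal sketch of the policies being compared. It is a toy model, not the actual ext2 allocator: the function names, the per-group free-inode list, and the "pick the emptiest group" rule standing in for the mkdir heuristic are all assumptions made for illustration.

```python
def pick_group_spread(free_inodes, parent_group):
    """Stand-in for 'switch cylinder groups on mkdir' (hypothetical):
    ignore the parent and pick the group with the most free inodes."""
    return max(range(len(free_inodes)), key=lambda g: free_inodes[g])


def pick_group_greedy(free_inodes, parent_group):
    """Purely greedy (hypothetical): stay in the parent's group while it
    has room, otherwise take the next group that does."""
    n = len(free_inodes)
    for offset in range(n):
        g = (parent_group + offset) % n
        if free_inodes[g] > 0:
            return g
    raise OSError("no free inodes in any group")


def simulate(policy, n_groups=8, inodes_per_group=100, n_dirs=20):
    """Create a chain of nested directories and record each one's group."""
    free = [inodes_per_group] * n_groups
    placement = []
    parent = 0
    for _ in range(n_dirs):
        g = policy(free, parent)
        free[g] -= 1
        placement.append(g)
        parent = g  # the next directory is created inside this one
    return placement


spread = simulate(pick_group_spread)
greedy = simulate(pick_group_greedy)
print("spread touches", len(set(spread)), "groups")
print("greedy touches", len(set(greedy)), "groups")
```

In this toy model the spreading policy scatters a freshly created tree across every group, while the greedy policy keeps it in one group until that group fills up. That is the locality difference at issue for something like a CVS checkout.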

I actually bet that the greedy algorithm would work wonderfully well, if
we just had run-time balancing. I bet that the main argument for
"spreading things out" is that it minimizes the risk of overflow in any
particular group, never mind performance.

And if you're stuck with the decision you make forever, minimizing risk is
a good approach.

And maybe the fundamental problem is exactly that: because we're stuck
with our decision forever, people felt that they couldn't afford to risk
doing what was very obviously the right thing.

So I still claim that we should look for short-term profit, and then try
to fix up the problems longer term. With, if required, some kind of
rebalancing.
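
As a rough illustration of what "some kind of rebalancing" could look like, the sketch below drains groups that have filled past a high-water mark into the emptiest group. Everything here is hypothetical: the `rebalance` name, the 90% threshold, and modelling usage as a bare per-group counter are assumptions, and a real filesystem would have to move actual inodes and blocks.

```python
def rebalance(usage, capacity, high_water_pct=90):
    """Move load out of groups above the high-water mark (hypothetical).

    usage: list of used units per group, modified in place.
    Returns a list of (src, dst, amount) moves that were applied.
    """
    limit = capacity * high_water_pct // 100
    moves = []
    for src in range(len(usage)):
        while usage[src] > limit:
            # The emptiest group is the target for the overflow.
            dst = min(range(len(usage)), key=lambda g: usage[g])
            if dst == src or usage[dst] >= limit:
                break  # nowhere left to put the excess
            amount = min(usage[src] - limit, limit - usage[dst])
            usage[src] -= amount
            usage[dst] += amount
            moves.append((src, dst, amount))
    return moves


usage = [100, 95, 10, 0]
moves = rebalance(usage, capacity=100)
print(moves)
print(usage)
```

The point of the sketch is only that a greedy placement decision stops being permanent once something like this can run later, which is what removes the pressure to spread things out up front.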

Linus
