> These calculations affect us all. They show which way computing
> will evolve under price and technology pressures. The calculations
> only look ahead to 2006, but that's what they show. For example,
> if we think about a 5PB system made of 5000 disks of 1TB each on a GE
> net, we calculate the aggregate bandwidth available in the topology as
> 50GB/s, which is less than what we need to keep the nodes fed
> at the rates they could be fed at (yes, a few % of loss translates into
> time and money). To increase the available bandwidth we must have more
> channels to the disks, and more disks, ... well, you catch my drift.
>
> So, start thinking about general mechanisms to do distributed storage.
> Not particular FS solutions.
Distributed systems will need somewhat different solutions, because
they are fundamentally different. Existing fs'es like ext2 are built
around a single-node assumption. I claim that making a new fs from
scratch for the distributed case is easier than tweaking ext2
and 10-20 other existing fs'es to work in such an environment.
Making a new fs from scratch isn't such a big deal after all.
To make a historical parallel:
Data used to be stored on sequential media like tapes (or
even stacks of punched cards), so filesystems were developed
for tapes. Then disks came along.
Using a disk as a tape with the existing tape-fs'es
worked, but didn't give much benefit. So we got something
new - block-based filesystems designed to take advantage
of the new random-access media.
The case of distributed storage is similar: it is fundamentally
different from the single-node case, just as random-access media
were different from sequential media.
I think a new design that considers both the benefits and
problems of many nodes will be much better than trying to
patch the existing fs'es. An approach that starts by
throwing away the thousand-fold speedup provided by caching
isn't very convincing.
If you merely proposed making the VFS and existing fs'es
cache-coherent, then I'd agree it might work well, but
it'd be a _lot_ of work. Which is no problem
if you volunteer to do the work. But simplification
by throwing away caching _will_ be too slow; it certainly
doesn't fit the idea of getting more bandwidth.
More bandwidth won't help if you throw all of it, and then some,
away on massive re-reading of data.
Wanting a generic mechanism instead of a special fs
might be the way to go, but it'd be a generic mechanism
used by a bunch of new fs'es designed for distributed operation.
There will probably be different needs for which people
will build different distributed fs'es. So a
"VDFS" makes sense for those fs'es, putting the common stuff
in one place. But I am sure the VDFS will contain cache
coherency calls for dropping pages from cache when
necessary, instead of dropping the cache unconditionally
in every case.
Helge Hafting
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/