Re: Hashing and directories

Pavel Machek (pavel@suse.cz)
Sat, 1 Jan 2000 02:02:13 +0000


Hi!

> I was hoping to point out that in real life, most systems that
> need to access large numbers of files are already designed to do
> some kind of hashing, or at least to divide-and-conquer by using
> multi-level directory structures.

Yes -- because they're working around kernel slowness.

I had to do this kind of hashing because the kernel disliked 70000 html
files (a copy of train timetables).
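
Something like this minimal sh sketch (filenames made up; I just
bucket by first character):

    # the for-loop glob expands inside the shell itself, so it never
    # hits the execve() argument limit the way an external command would
    for f in *; do
        [ -f "$f" ] || continue          # only regular files
        d=$(printf '%.1s' "$f")          # first character names the bucket
        mkdir -p "$d"
        mv -- "$f" "$d/$f"
    done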

BTW, try rm * with 70000 files in a directory -- the command line will
overflow.
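
(It is really execve()'s ARG_MAX limit -- the shell prints "Argument
list too long". The usual dodge, assuming GNU or BSD find, is to let
find hand the names to rm in batches:

    find . -maxdepth 1 -type f -exec rm -- {} +

The -exec ... {} + form packs as many names into each rm invocation
as the limit allows.)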

> A particular reason for this, apart from filesystem efficiency,
> is to make it easier for people to find things, as it is usually
> easier to spot what you want amongst a hundred things than among
> a thousand or ten thousand.

Yes? Is it easier to type cat timetab1/2345 than cat timetab12345? With a
bigger command-line limit, putting it all into *one* directory is
definitely easier.

> A couple of practical examples from work here at Netcom UK (now
> Ebone :), would be say DNS zone files or user authentication data.
> We use Solaris and NFS a lot, too, so large directories are a bad
> thing in general for us, so we tend to subdivide things using a
> very simple scheme: taking the first letter and then sometimes
> the second letter or a pair of letters from the filename. This
> actually works extremely well in practice, and as mentioned above
> provides some positive side-effects.

Positive? Try listing all names that contain "linux" under such a scheme.
I'll do ls *linux*. You'll need ls */*linux* ?l/inux* li/nux*. Seems ugly
to me.
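
To make the "ugly" concrete, a toy sh setup -- assuming the scheme
strips the hashed letters into the directory name, which is my reading
of it:

    mkdir -p li xl xx
    touch li/nux-notes xl/inux-notes xx/linux-notes
    # "linux" can straddle the split point, so no single glob
    # finds all three; you need one pattern per position:
    ls li/nux* ?l/inux* */*linux*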
Pavel

-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
