Re: [PATCH] nfs_unlink() race (was: nfs_refresh_inode: inode number mismatch)

Andreas Dilger (adilger@clusterfs.com)
Wed, 11 Jun 2003 00:16:58 -0600


On Jun 10, 2003 22:30 -0700, Linus Torvalds wrote:
> On Wed, 11 Jun 2003 viro@parcelfarce.linux.theplanet.co.uk wrote:
> > Two different dentries for the same file is obviously not a problem...
>
> It _is_ a problem. It does the wrong thing on any subsequent directory
> operation (move or unlink).
>
> Multiple aliased dentries have never been ok, unless the filesystem
> explicitly handles them and invalidates them (ie ntfs/fat kind of things).

Amusingly, we hit _exactly_ this same problem a couple of days ago
in Lustre because we were trying to use RW locks instead of the single
directory i_sem to improve large (10M files) directory lookup performance,
and it is hard to hit and even harder to detect.

We've gone back to using i_sem to protect the actual dentry instantiation
in lookup for now (we have a separate DLM for doing RW locking of the
directory for create/rename/unlink), but will be looking at ways to allow
this in the filesystem again at some time in the future. Allowing RW
locking for local filesystem lookups would probably also be a win.

My 5-minute theory to fix this is to re-check the inode dentry alias list
when you are doing d_instantiate() (with dcache lock held) for another
dentry with the same hash (and further the same name if the hashes match),
but that would behave poorly when there are large numbers of links to a
single file. It _might_ be safe to check the first N entries on the
i_dentry list, where N ~= num_cpus, assuming that we could only have N
racy dentry lookups going at one time. Even so, the boost of allowing
multiple parallel lookups for huge directories probably beats out the
extra searching slowdown for the uncommon many-links-to-inode case.

Cheers, Andreas

--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/