[PATCH] 2.4.9 Fix 'noac' and 'sync' mount flags

Trond Myklebust (trond.myklebust@fys.uio.no)
Sat, 8 Sep 2001 18:24:11 +0200


Hi,

The following patch has been requested by people who are interested in
using databases over NFS. Basically they argue that our implementation
of the 'noac' mount flag differs from that of other *NIX in that we
still allow asynchronous writes. As 'noac' is supposed to mean no
attribute caching, caching writes on the client is a violation.

For various reasons, they also argue against adopting the 'O_SYNC'
style behaviour of firing off asynchronous writes and then using
generic_osync_inode() to sync the data. Basically they want 'noac' to
ensure that only one NFS_STABLE write is going down the wire at any
point in time, as this is the standard *NIX behaviour.

After studying the problem, we've come to the conclusion that the best
solution is the following, in which we simply use nfs_writepage_sync()
whenever the inode is synchronous. That limits the effective wsize to
at most PAGE_CACHE_SIZE, but it's the only way of avoiding races.
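
To illustrate the wsize limit, here is a rough user-space sketch (not
kernel code). The function name and the 4096-byte page size are
assumptions made up for the example; only the rule itself comes from
the patch.

```c
#include <stdbool.h>

/* Assumed page size, for illustration only. */
#define MODEL_PAGE_CACHE_SIZE 4096u

/*
 * Largest single WRITE request the client sends under this model.
 * The synchronous path writes one page at a time, so it can never
 * put more than one page into a request, whatever wsize was mounted.
 */
static unsigned int model_effective_wsize(unsigned int wsize, bool is_sync)
{
	if (is_sync || wsize < MODEL_PAGE_CACHE_SIZE)
		return wsize < MODEL_PAGE_CACHE_SIZE ? wsize
						     : MODEL_PAGE_CACHE_SIZE;
	/* Asynchronous path: requests may be as large as wsize. */
	return wsize;
}
```

Under these assumptions a mount with wsize=32768 sends requests of at
most 4096 bytes once 'sync' or 'noac' is in effect, while a mount with
wsize=1024 is unaffected because it was already below the page size.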

Incidentally, the patch also fixes a typo in nfs_writepage(), where I
had used rsize rather than wsize in deciding whether or not to use
asynchronous writes.
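
A small user-space model shows where the typo bites. The struct, the
function names and the page size below are hypothetical stand-ins; the
two conditions mirror the old and new tests in nfs_writepage().

```c
#include <stdbool.h>

/* Assumed page size, for illustration only. */
#define MODEL_PAGE_CACHE_SIZE 4096u

/* Hypothetical stand-in for the per-server transfer sizes. */
struct model_server {
	unsigned int rsize;	/* read transfer size */
	unsigned int wsize;	/* write transfer size */
};

/* Test as it stood before the patch: looks at the read size by mistake. */
static bool model_use_async_old(const struct model_server *s)
{
	return s->rsize >= MODEL_PAGE_CACHE_SIZE;
}

/* Test after the patch: write size, and never async on a sync inode. */
static bool model_use_async_new(const struct model_server *s, bool is_sync)
{
	return s->wsize >= MODEL_PAGE_CACHE_SIZE && !is_sync;
}
```

With rsize=8192 and wsize=1024 the old test picks the asynchronous
path even though a page does not fit in one write request; the new
test picks the synchronous path. The two only disagree when rsize and
wsize fall on different sides of the page size, or when the inode is
synchronous.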

Cheers,
Trond

diff -u --recursive --new-file linux-2.4.9/fs/nfs/inode.c linux-2.4.9-sync/fs/nfs/inode.c
--- linux-2.4.9/fs/nfs/inode.c Thu Aug 16 18:39:37 2001
+++ linux-2.4.9-sync/fs/nfs/inode.c Thu Aug 30 09:13:33 2001
@@ -312,6 +312,7 @@
 	if (data->flags & NFS_MOUNT_NOAC) {
 		data->acregmin = data->acregmax = 0;
 		data->acdirmin = data->acdirmax = 0;
+		sb->s_flags |= MS_SYNCHRONOUS;
 	}
 	server->acregmin = data->acregmin*HZ;
 	server->acregmax = data->acregmax*HZ;
diff -u --recursive --new-file linux-2.4.9/fs/nfs/write.c linux-2.4.9-sync/fs/nfs/write.c
--- linux-2.4.9/fs/nfs/write.c Thu Aug 16 18:39:37 2001
+++ linux-2.4.9-sync/fs/nfs/write.c Thu Aug 30 09:31:37 2001
@@ -288,7 +288,7 @@
 		goto out;
 do_it:
 	lock_kernel();
-	if (NFS_SERVER(inode)->rsize >= PAGE_CACHE_SIZE) {
+	if (NFS_SERVER(inode)->wsize >= PAGE_CACHE_SIZE && !IS_SYNC(inode)) {
 		err = nfs_writepage_async(NULL, inode, page, 0, offset);
 		if (err >= 0)
 			err = 0;
@@ -1031,7 +1031,7 @@
 	 * If wsize is smaller than page size, update and write
 	 * page synchronously.
 	 */
-	if (NFS_SERVER(inode)->wsize < PAGE_SIZE)
+	if (NFS_SERVER(inode)->wsize < PAGE_CACHE_SIZE || IS_SYNC(inode))
 		return nfs_writepage_sync(file, inode, page, offset, count);

/*
