Re: nfs MAP_SHARED corruption fix

Marcelo Tosatti (marcelo@conectiva.com.br)
Wed, 9 May 2001 21:00:26 -0300 (BRT)


On Thu, 10 May 2001, Andrea Arcangeli wrote:

> If some page wasn't yet visible in the dirty_pages list by the time
> __sync_one started, we'll find I_DIRTY_PAGES set. This is enforced by
> the locking order (sync_one first clears the I_DIRTY_PAGES and then
> it starts browsing the dirty_pages list while set_page_dirty first make the
> page visible and then marks the inode dirty).
>
> So the I_DIRTY_PAGES check guarantees that those dirty pages cannot be
> lost in iput, that was the _only_ object of the patch and that is
> certainly enough to fix the nfs fs data corruption reported.
>
> Now if you claim that munmap needs to be synchronous for nfs that's a
> completly different matter. I didn't even tried to make it synchronous.
> It is possible it has to be synchronous, even write(2) (in theory ;) has
> to behave like O_SYNC with nfs, but I'm not sure.

I suggested the removal of I_DIRTY_PAGES check because the current
behaviour of munmap seems to be synchronous (1), so I guess you _always_
want it to be synchronous.

1) nfs_wb_file() flushes the dirty data and then waits for completion.

Trond?

> Another thing (completly unrelated to the above issues) that I noticed
> while looking over this nfs code is that the __sync_one() for example
> called by generic_file_write(O_SYNC) will recall fdatasync but no nfs_wb_all
> is put before the fdatawait, and I'm not sure that the nfs_sync_page
> called by the fdatawait is enough to rapidly flush the writepaged stuff
> to the nfs server. nfs_sync_page apparently only cares about speculative
> reads, not at all about committing writebacks. It would look much saner
> to me if nfs_sync_page also does a nfs_wb_all() on the inode, so that
> the ->sync_page callback gets the same semantics it has for the real
> filesystems.

Looks sane and will probably makes things faster.

Again, Trond? :)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/