Re: [patch 12/16] fix race between writeback and unlink

Andrew Morton (akpm@zip.com.au)
Sat, 01 Jun 2002 15:25:56 -0700


William Lee Irwin III wrote:
>
> Linus Torvalds wrote:
> >> The general VFS layer really shouldn't have assigned that strogn a meaning
> >> to "i_nlink" anyway, it's not for the VFS layer to decide (and it only
> >> causes problems for any non-UNIX-on-a-disk filesystems).
>
> On Sat, Jun 01, 2002 at 12:19:36PM -0700, Andrew Morton wrote:
> > Yes, I suspect all the inode refcounting, locking, I_FREEING, I_LOCK, etc
> > could do with a spring clean. Make it a bit more conventional. I'll
> > discuss with Al when he resurfaces.
>
> I'm somewhat concerned about the protection of ->i_size, since that
> appears to be accessed in generic_file_read() without any protection
> against writers to the field. From a quick glance at current 2.5 (it
> looks like 2.4 has this too) it looks like it's written to by
> vmtruncate() through notify_change() with the ->i_sem and BKL held at
> the moment, but generic_file_read() doesn't take either before reading
> it, and there may be still other writers.

truncate and write change i_size, under i_sem. The i_size test on
the read path doesn't really need to be there, I suspect. It handles
the window where i_size has been decreased by truncate but the filesystem
hasn't finished truncating the blocks yet. It also optimises reads
outside the end of file - no point in calling into the filesystem
to try to map blocks which aren't there.

> I also don't see the anything
> like read_barrier_depends() for lockless algorithms or any atomic reads.
> Even on machines with extremely strong memory consistency models like
> i386, as loff_t is long long, it would seem possible to catch a partial
> update and see an entirely bogus ->i_size value.

That's true. sys_stat() also could see a confusing intermediate value.
A while back Ingo and Linus were tossing around possible solutions to
this based on x86 compare-and-exchange operations, but nothing conclusive
came out of it. It's a "known bug".

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/