Re: 2.4.19rc2aa1 i_size atomic access

Andrea Arcangeli (andrea@suse.de)
Mon, 29 Jul 2002 20:47:58 +0200


On Mon, Jul 29, 2002 at 11:37:30AM -0700, Bob Miller wrote:
> On Tue, Jul 23, 2002 at 07:47:12PM +0200, Andrea Arcangeli wrote:
> > On Tue, Jul 23, 2002 at 07:08:07PM +0200, Andrea Arcangeli wrote:
> > > So while merging it I rewrote it this way (I also change the type of the
> >
> > here it is the final full patch:
> >
>
> Stuff deleted...
>
> > diff -urNp race/include/asm-i386/system.h race-fix/include/asm-i386/system.h
> > --- race/include/asm-i386/system.h Tue Jul 23 18:46:44 2002
> > +++ race-fix/include/asm-i386/system.h Tue Jul 23 18:47:10 2002
> > @@ -143,6 +143,8 @@ struct __xchg_dummy { unsigned long a[10
> > #define __xg(x) ((struct __xchg_dummy *)(x))
> >
> >
> > +#ifdef CONFIG_X86_CMPXCHG
> > +#define __ARCH_HAS_GET_SET_64BIT 1
> > /*
> > * The semantics of XCHGCMP8B are a bit strange, this is why
> > * there is a loop and the loading of %%eax and %%edx has to
> > @@ -167,7 +169,7 @@ static inline void __set_64bit (unsigned
> > "lock cmpxchg8b (%0)\n\t"
> > "jnz 1b"
> > : /* no outputs */
> > - : "D"(ptr),
> > + : "r"(ptr),
> > "b"(low),
> > "c"(high)
> > : "ax","dx","memory");
> > @@ -197,6 +199,32 @@ static inline void __set_64bit_var (unsi
> > __set_64bit(ptr, (unsigned int)(value), (unsigned int)((value)>>32ULL) ) : \
> > __set_64bit(ptr, ll_low(value), ll_high(value)) )
> >
>
> Stuff deleted...
> >
> > +/*
> > + * NOTE: in a 32bit arch with a preemptable kernel and
> > + * an UP compile the i_size_read/write must be atomic
> > + * with respect to the local cpu (unlike with preempt disabled),
> > + * but they don't need to be atomic with respect to other cpus like in
> > + * true SMP (so they need either to either locally disable irq around
> > + * the read or for example on x86 they can be still implemented as a
> > + * cmpxchg8b without the need of the lock prefix). For SMP compiles
> > + * and 64bit archs it makes no difference if preempt is enabled or not.
> > + */
> > +static inline loff_t i_size_read(struct inode * inode)
> > +{
> > +#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> > +#ifdef __ARCH_HAS_GET_SET_64BIT
> > + return (loff_t) get_64bit((unsigned long long *) &inode->i_size);
> > +#else
> > + loff_t i_size;
> > + int v1, v2;
> > +
> > + /* Retry if i_size was possibly modified while sampling. */
> > + do {
> > + v1 = inode->i_size_version1;
> > + rmb();
> > + i_size = inode->i_size;
> > + rmb();
> > + v2 = inode->i_size_version2;
> > + } while (v1 != v2);
> > +
> > + return i_size;
> > +#endif
> > +#elif BITS_PER_LONG==64 || !defined(CONFIG_SMP)
> > + return inode->i_size;
> > +#endif
> > +}
> > +
>
> Andrea,
>
> Sorry for responding to this thread so late (I have been on holiday)...
> I don't like creating __ARCH_HAS_GET_SET_64BIT and then doing conditional
> code based on it. I believe that get_64bit() and set_64bit() should
> always be defined and used. On x86 with cmpxchg8b and SMP or PREEMPT
> get_64bit() and set_64bit() use cmpxchg8b. On 386 and 486 with SMP
> or PREEMPT create "safe" versions i.e.:
>
> static inline void set_64bit(unsigned long long * ptr, unsigned long long value)
> {
> lock_kernel();
> *ptr = value;
> unlock_kernel();
> }
>
> static inline unsigned long long get_64bit(unsigned long long * ptr)
> {
> unsigned long long retval;
>
> lock_kernel();
> retval = *ptr;
> unlock_kernel
>
> return reval;
> }
>
> I know BKL sucks but how many SMP/PREEMPT 386/486 boxes are really out there?
>
> And for all non SMP or PREEMPT do:
>
> static inline void set_64bit(unsigned long long * ptr, unsigned long long value)
> {
> *ptr = value;
> }
>
> static inline unsigned long long get_64bit(unsigned long long * ptr)
> {
> return *ptr;
> }
>
> Other arches are free to do the "right thing" for them selfs.

well, the point are exactly the other archs, i.e. s390. Why should s390
lose total scalability during read/writes with a big kernel lock or
similar, when they can use the ordered version? (maybe it's even
possible to read/write 64bit atomically on s390 like we do on 586+, but
still some 32bit arch may not provide that functionality and they
will definitely prefer the ordered read/write using sequence numbers
rather than a big kernel lock or global lock) That's the whole point of
the __ARCH_HAS_GET_SET_64BIT information.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/