Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists

Davide Libenzi (davidel@xmailserver.org)
Fri, 12 Oct 2001 18:54:36 -0700 (PDT)

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Keith Owens: "Re: crc32 cleanups"
Previous message: Jeff Garzik: "Re: crc32 cleanups"

On Sat, 13 Oct 2001, Paul Mackerras wrote:

> Linus Torvalds writes:
>
> > So how about just going all the way and calling it what it is:
> > "data_dependent_read_barrier()", with a notice in the PPC docs about how
> > the PPC cannot speculate reads anyway, so it's a no-op.
>
> To set the record straight, the PPC architecture spec says that
> implementations *can* speculate reads, but they have to make it look
> as if dependent loads have a read barrier between them.
>
> And it isn't speculated reads that are the problem in the alpha case,
> it's the fact that the cache can reorder invalidations that are
> received from the bus. That's why you can read the new value of p but
> the old value of *p on one processor after another processor has just
> done something like a = 1; wmb(); p = &a.

I think that necessary condition to have a reordering of bus sent
invalidations is to have a partitioned cache architecture.
I don't see any valid reason for a cache controller of a linear cache
architecture to reorder an invalidation stream coming from a single cpu.
If the invalidation sequence for cpu1 ( in a linear cache architecture )
is 1a, 1b, 1c, ... this case can happen :

1a, 2a, 1b, 2b, 2c, 1c, ...
| | |
.........................

but not this :

1a, 2a, 2b, 1c, 2c, 1b, ...
| | |
...............><........

> My impression from what Paul McKenney was saying was that on most
> modern architectures other than alpha, dependent loads act as if they
> have a read barrier between them. What we need to know is which
> architectures specify that behaviour in their architecture spec, as
> against those which do that today but which might not do it tomorrow.

Uhmm, an architecture that with a = *p; schedule the load of *p before
the load of p sounds screwy to me.
The problem is that even if cpu1 schedule the load of p before the
load of *p and cpu2 does a = 1; wmb(); p = &a; , it could happen that
even if from cpu2 the invalidation stream exit in order, cpu1 could see
the value of p before the value of *p due a reordering done by the
cache controller delivering the stream to cpu1.

- Davide

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Keith Owens: "Re: crc32 cleanups"
Previous message: Jeff Garzik: "Re: crc32 cleanups"