> Making the AGP Aperture write-back cacheable is not good from
> a performance perspective. (I can't comment on which is better
> from a Linux Kernel perspective).
Mainly I was arguing from.
- Make the common case fast.
- The common case is write-back.
- AGP is not the common case.
- AGP has performance limitations.
because physical aliasing is introduced, with the AGP aperture. So
the cache coherency protocols cannot make our lives simpler.
> An Aperture Page can be made
> cache-coherent depending on the implementation
> and the AGP 3.0 spec provides an
> architectural way of specifying and controlling these as
> well. But, by default the area is not made cache-coherent
> due to the performance loss and the lack of software to take
> advantage of it -- the two play off against each
Cache coherency is tricky, so there is some argument there.
> Making it cache-coherent causes every AGP access to
> snoop processor caches and this can be quite a hit in
> performance when you consider the predominant AGP software
> model. Most software that takes advantage of AGP is still
> using the old Intel model of uncacheable, the majority of
> data placed in the Aperture are read-only structures for the
> AGP device -- such as vertex lists, locked vertex arrays,
> and texture data. For the most part this fits the current
> paradigm of throwing textures and vertices at the graphics
> device. The only graphics area found so far that could
> benefit from a coherent aperture is video capture data which
> streams in from the graphics device and requires CPU
I hadn't thought of the snooping from the AGP side, but even then given
that the AGP aperture is a fixed region it would probably work to just
have a fixed snoop on the AGP region, and only do something when AGP
traffic comes in. Though I will buy the argument it may not be
possible to do it at full performance unless the AGP card knows
something about cache coherency. Though mostly I suspect it is a cost
If the area is purely uncacheable, then writing to that area cannot go
at full memory speed. So we should at the very least mark the region
as write-combining. This should be get the cpu putting data in there
at about 1400MB/s with PC2100, and moving data there just short of
2100MB/s. This doesn't help directly AGP performance, but it does
allow the cpu to spend it's cycles on more important things, much
I don't believe there is a memory caching attribute that would get the
data copy from the AGP aperture sped up except write-back. Which is
where I guess video capture comes in.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to email@example.com
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/