Pentium SSE prefetcht0 instruction... How do you make it work

Bulent Abali (abali@us.ibm.com)
Thu, 27 Sep 2001 14:40:55 -0400


I have a system with a large external L3 cache. Uses Pentium III
Coppermine (stepping 3) processor. L3 block size is relatively long. I
am trying to increase the L3 performance by prefetching L3 lines. I
thought prefetcht0, t1, t2, nta instructions may help. These instructions
prefetch L1/L2 lines in to the processor therefore they should prefetch L3
as well. However I see no benefit. It is as if prefetch never make it to
the front side bus. I took the example of arch/i386/lib/mmx.c to implement
the asm level routines. Perhaps I am overlooking something. I'd
appreciate any suggestions. /Bulent

//This should prefetch an L2 line at addr (hence L3 line prefetch)
inline void L3_prefetch (char * addr)
{
asm volatile("prefetcht1 %0" :: "m" (addr));
}

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/