Pentium SSE prefetcht0 instruction... How do you make it work

Bulent Abali (
Thu, 27 Sep 2001 14:40:55 -0400

I have a system with a large external L3 cache. Uses Pentium III
Coppermine (stepping 3) processor. L3 block size is relatively long. I
am trying to increase the L3 performance by prefetching L3 lines. I
thought prefetcht0, t1, t2, nta instructions may help. These instructions
prefetch L1/L2 lines in to the processor therefore they should prefetch L3
as well. However I see no benefit. It is as if prefetch never make it to
the front side bus. I took the example of arch/i386/lib/mmx.c to implement
the asm level routines. Perhaps I am overlooking something. I'd
appreciate any suggestions. /Bulent

//This should prefetch an L2 line at addr (hence L3 line prefetch)
inline void L3_prefetch (char * addr)
asm volatile("prefetcht1 %0" :: "m" (addr));

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at