Re: [beta patch] SSE copy_page() / clear_page()

Manfred Spraul (manfred@colorfullife.com)
Wed, 14 Feb 2001 23:37:59 +0100

Next message: Tom Sightler: "Re: Samba performance / zero-copy network I/O"
Previous message: george anzinger: "[ANNOUNCEMENT] High resolution timer mailing list/ project"

This is a multi-part message in MIME format.
--------------2BD94515AC96F0EBA3546C45
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

I have another idea for SSE, and this one is far safer:

Only use the SSE prefetch instructions, and leave the actual copy to the
string operations. The prefetch instructions only prefetch and don't touch
the SSE registers, so there are neither reentrancy nor interrupt problems.
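
A minimal user-space sketch of the idea, not the kernel patch itself. It
assumes GCC or Clang, whose __builtin_prefetch() with locality 0 typically
compiles to prefetchnta on SSE-capable x86. The helper name
prefetch_then_copy and the 32-byte stride are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: hint the source buffer into
 * the cache, then let the ordinary copy routine do the real work. */
static void prefetch_then_copy(void *dst, const void *src, size_t size)
{
	const char *s = src;
	size_t i;

	assert(dst != NULL && src != NULL);

	/* Skip the hints for small copies, as the patch does. */
	if (size > 128) {
		/* One hint per 32 bytes (assumed cache line size).
		 * rw = 0 (read), locality = 0 (non-temporal). */
		for (i = 0; i < size; i += 32)
			__builtin_prefetch(s + i, 0, 0);
	}

	/* The hints change no register or memory state, so the copy
	 * itself is left entirely to the normal routine. */
	memcpy(dst, src, size);
}
```

Because a prefetch is only a hint, the copy stays correct even if the CPU
ignores every one of them; only the timing changes.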

I tried the attached hack^H^H^H^Hpatch, and read(fd, buf, 4000000) from
user space got 7% faster (from 264768842 cycles to 246303748 cycles,
single cpu, noacpi, 'linux -b', fastest time from several thousand
runs).
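
The "fastest time from several thousand runs" measurement can be sketched
roughly as below. This is an assumption about the method, not the harness
actually used: it times with the portable clock() instead of the CPU cycle
counter, and the names fastest_run and count_call are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload: just counts how often it was called. A real
 * test would issue the read(fd, buf, 4000000) here instead. */
static void count_call(void *arg)
{
	++*(int *)arg;
}

/* Hypothetical harness: run fn(arg) `runs` times and return the fastest
 * single run in clock() ticks. */
static clock_t fastest_run(void (*fn)(void *), void *arg, int runs)
{
	clock_t best = 0;
	int i;

	assert(fn != NULL && runs > 0);

	for (i = 0; i < runs; i++) {
		clock_t start = clock();
		clock_t elapsed;

		fn(arg);
		elapsed = clock() - start;

		/* Keep the minimum: slower runs mostly measure
		 * interference such as interrupts or cache misses
		 * caused by other activity. */
		if (i == 0 || elapsed < best)
			best = elapsed;
	}
	return best;
}
```

Calling fastest_run(count_call, &calls, 1000) runs the workload 1000 times
and reports the quickest pass.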

The reason why this works is simple:

Intel Pentium III and Pentium 4 have hardcoded "fast string copy" operations
that invalidate whole cache lines during writes (documented in the most
obvious place: multiprocessor management, memory ordering).

The result is a very fast write, but the read is still slow, and the read
is the part the prefetch speeds up.

--
	Manfred
--------------2BD94515AC96F0EBA3546C45
Content-Type: text/plain; charset=us-ascii;
 name="patch-sse-prefetchnta"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="patch-sse-prefetchnta"

--- 2.4/mm/filemap.c	Wed Feb 14 10:51:42 2001
+++ build-2.4/mm/filemap.c	Wed Feb 14 22:11:44 2001
@@ -1248,6 +1248,20 @@
 		size = count;
 
 	kaddr = kmap(page);
+	if (size > 128) {
+		int i;
+		__asm__ __volatile__(
+			"movl (%1), %0\n\t"
+			: "=r" (i)
+			: "r" (kaddr+offset)); /* load tlb entry */
+		for(i=0;i<size;i+=64) {
+			__asm__ __volatile__(
+				"prefetchnta (%1, %0)\n\t"
+				"prefetchnta 32(%1, %0)\n\t"
+				: /* no output */
+				: "r" (i), "r" (kaddr+offset));
+		}
+	}
 	left = __copy_to_user(desc->buf, kaddr + offset, size);
 	kunmap(page);
 	

--------------2BD94515AC96F0EBA3546C45--

