This is with 2.4.9-ac12, non-preempt and without the patch, on an Athlon 850
k7-2, niced to -20. It was compiled with gcc 3.0.2 and no CFLAGS.
Athlon test program $Id: fast.c,v 1.6 2000/09/23 09:05:45 arjan Exp $ 
clear_page() tests 
clear_page function 'warm up run'        took 21392 cycles per page
clear_page function '2.4 non MMX'        took 14020 cycles per page
clear_page function '2.4 MMX fallback'   took 14778 cycles per page
clear_page function '2.4 MMX version'    took 15416 cycles per page
clear_page function 'faster_clear_page'  took 4396 cycles per page
clear_page function 'even_faster_clear'  took 4165 cycles per page
copy_page() tests 
copy_page function 'warm up run'         took 19788 cycles per page
copy_page function '2.4 non MMX'         took 23562 cycles per page
copy_page function '2.4 MMX fallback'    took 23174 cycles per page
copy_page function '2.4 MMX version'     took 20422 cycles per page
copy_page function 'faster_copy'         took 10132 cycles per page
copy_page function 'even_faster'         took 9449 cycles per page
Using optimizations with this code actually slows it down for me. Perhaps
this means something: when using asm code, it may be better not to use
any compiler flags in the kernel config. The patch is still needed, of course,
to stop user programs from crashing the system, but for asm sources in the
kernel it may be worth trying no compiler flags to see if that increases
performance. This could all be a gcc 3.x-ism, though.