Alan Cox wrote:
>On Sat, 2002-10-26 at 20:22, Manfred Spraul wrote:
>
>
>>kmalloc spends a large part of the total execution time trying to find
>>the cache for the passed in size.
>>
>>What about the attached patch (against 2.5.44-mm5)?
>>It uses fls jump over the caches that are definitively too small.
>>
>>
>
>Out of curiousity how does fls compare with finding the right cache by
>using a binary tree walk ? A lot of platforms seem to use generic_fls
>which has a lot of conditions in it and also a lot of references to just
>computed values that look likely to stall
>
>
Binary tree walk means 4 unpredictable branches and at least i386 can
use bsrl for a fast fls().
Patch is attached.
--
Manfred
--------------040808040707070602020605
Content-Type: text/plain;
name="patch-fls"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="patch-fls"
--- 2.5/include/asm-i386/bitops.h Sun Sep 22 06:25:12 2002
+++ build-2.5/include/asm-i386/bitops.h Sun Oct 27 11:04:57 2002
@@ -414,11 +414,22 @@
return word;
}
-/*
+/**
* fls: find last bit set.
+ * @x: The word to search
+ *
*/
-#define fls(x) generic_fls(x)
+static inline int fls(int x)
+{
+ int r;
+
+ __asm__("bsrl %1,%0\n\t"
+ "jnz 1f\n\t"
+ "movl $-1,%0\n"
+ "1:" : "=r" (r) : "g" (x));
+ return r+1;
+}
#ifdef __KERNEL__
--------------040808040707070602020605--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/