Re: [PATCH] task_struct + kernel stack colouring ...

Shuji YAMAMURA (yamamura@flab.fujitsu.co.jp)
Wed, 05 Dec 2001 21:22:00 +0900


Hi,

Your patch achieved very good performance in our benchmark test, but
I would like to make the following two points about your implementation
of kernel stack colouring.

1. Your patch does colour the kernel stack, but it does not
reduce cache line conflicts.
2. I could not see any performance difference between running with
and without stack colouring.

[1] get_stack_jitter() in arch/i386/kernel/process.c selects 3 bits
from the cache line index bits. So cache conflicts still occur, just at
a shifted line.

Suppose the cache is 256KB, 4-way set associative. Then the distance
between the addresses of two blocks that map to the same set is a
multiple of 64KB (256KB / 4 ways). This means the lower 16 bits of such
addresses are always the same.
The patch takes its 3 colour bits from bit positions 13 to 15, which lie
within those lower 16 bits, so data that maps to the same set always
gets the same colour. From the viewpoint of cache miss reduction, this
colouring has no effect.
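
To make the arithmetic concrete, here is a small stand-alone sketch. It
is not part of either patch: the names WAY_SIZE, LINE_SIZE, colour_of()
and same_set() are made up for illustration, and a 32-byte line size is
assumed.

```c
/* Assumed cache geometry for illustration: 256KB, 4-way, 32-byte lines. */
#define CACHE_SIZE (256ul * 1024ul)
#define CACHE_WAYS 4ul
#define LINE_SIZE  32ul
#define WAY_SIZE   (CACHE_SIZE / CACHE_WAYS)  /* 64KB between conflicting addresses */

/* Hypothetical helper: 3-bit colour read from the address, starting at bit `shift`. */
static inline unsigned long colour_of(unsigned long addr, unsigned int shift)
{
	return (addr >> shift) & 0x7ul;
}

/* Two addresses map to the same set when their index bits (bits 5..15 here) agree. */
static inline int same_set(unsigned long a, unsigned long b)
{
	return ((a ^ b) & (WAY_SIZE - 1) & ~(LINE_SIZE - 1)) == 0;
}
```

For example, with a = 0xc1234000 and b = a + 3 * WAY_SIZE, same_set(a, b)
is true and colour_of(a, 13) equals colour_of(b, 13), while
colour_of(a, 16) and colour_of(b, 16) differ.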

The patch which I posted before uses 3 bits, from bit
position 18 to 20 (1MB 4-way L2 cache), for task_struct colouring.
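
The bit positions quoted for the two cache sizes follow from the same
rule: the first usable colour bit is log2(cache size / ways). A minimal
sketch of that calculation, with a hypothetical helper name that appears
in neither patch:

```c
/*
 * Hypothetical helper: lowest bit position above the set index and
 * line offset bits, i.e. log2(cache_size / ways). Assumes the way
 * size is a power of two.
 */
static inline unsigned int first_bit_above_index(unsigned long cache_size,
						 unsigned int ways)
{
	unsigned long way_size = cache_size / ways;
	unsigned int bit = 0;

	while ((1ul << bit) < way_size)
		bit++;
	return bit;
}
```

This gives 16 for a 256KB 4-way cache and 18 for a 1MB 4-way cache.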

I suggest the following two ways of doing stack colouring.

(a) Use bits above the cache index bits. (E.g. on a 256KB L2-cache
system, STACK_SHIFT_BITS should be 16 (11 index bits + 5 offset
bits).)

(b) Use a modulo operation for colouring.

in get_stack_jitter() (arch/i386/kernel/process.c)
+#define NUM_COLOUR 9 /* number of colours (an odd number) */
static inline unsigned long get_stack_jitter(struct task_struct *p)
{
- return ((TSK_TO_KSTACK(p) >> STACK_SHIFT_BITS) & STACK_COLOUR_MASK) << L1_CACHE_SHIFT;
+ return ((TSK_TO_KSTACK(p) >> STACK_SHIFT_BITS) % NUM_COLOUR) << L1_CACHE_SHIFT;
}
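
The effect of (b) can be checked in user space with the sketch below. It
is only an illustration: the DEMO_* values are assumptions (shift 13 and
mask 0x7 are inferred from "3 bits, from bit position 13 to 15", and
32-byte cache lines are assumed), and the two functions take a plain
stack address instead of a task_struct.

```c
/* Assumed values for illustration only, not taken from the patch itself. */
#define DEMO_STACK_SHIFT_BITS  13     /* colour bits start at bit 13 */
#define DEMO_STACK_COLOUR_MASK 0x7ul  /* 3 colour bits */
#define DEMO_L1_CACHE_SHIFT    5      /* 32-byte cache lines */
#define DEMO_NUM_COLOUR        9      /* odd number of colours */

/* Mask-based jitter: colour is bits 13..15 of the stack address. */
static inline unsigned long jitter_mask(unsigned long kstack)
{
	return ((kstack >> DEMO_STACK_SHIFT_BITS) & DEMO_STACK_COLOUR_MASK)
		<< DEMO_L1_CACHE_SHIFT;
}

/* Modulo-based jitter: colour is (address >> 13) modulo an odd number. */
static inline unsigned long jitter_modulo(unsigned long kstack)
{
	return ((kstack >> DEMO_STACK_SHIFT_BITS) % DEMO_NUM_COLOUR)
		<< DEMO_L1_CACHE_SHIFT;
}
```

Under these assumptions, two stacks 64KB apart get the same jitter from
jitter_mask() but different jitter from jitter_modulo(). Since 9 shares
no factor with 8, the same holds for stacks that are k * 64KB apart for
k from 1 to 8.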

[2] I measured the effect of your patch on 4-way PIII-Xeon systems with
1MB L2 cache using web-bench (apache 1.3.19), and also measured
the performance of a modified version (*), which uses 3 bits, from
bit position 16 to 18, to avoid cache conflicts.

[Benchmarking Result] Request processing performance improvement,
compared to the original kernel (2.5.0):

  2.5.0 + Davide's Patch                                   ... +11.8%
  2.5.0 + Davide's Patch (STACK_COLOUR_BITS = 0)           ... +11.8%
  2.5.0 + Davide's Patch* (alternate version with [1](a))  ... +11.9%

Considering these results, the effect of stack colouring is very
slight (+0.1%). The major effect of this patch comes from task_struct
colouring, and we saw no performance gain from stack colouring, at
least in our experiments.

-----
Computer Systems Laboratories, Fujitsu Labs.
Shuji YAMAMURA (yamamura@flab.fujitsu.co.jp)