Re: [Bug 350] New: i386 context switch very slow compared to 2.4 due to wrmsr (performance)

Linus Torvalds (torvalds@transmeta.com)
Wed, 12 Feb 2003 05:54:37 +0000 (UTC)


In article <20030212041848.GA9273@bjl1.jlokier.co.uk>,
Jamie Lokier <jamie@shareable.org> wrote:
>
>A cute and wonderful hack is to use the 6 words in the TSS prior to
>&tss->es as the trampoline. Now that __switch_to is done in software,
>those words are not used for anything else.

No!!

That's not cute and wonderful, that's _horrible_.

Mixing data and code on the same page is very very slow on a P4 (well, I
think it's "same half-page", but the point is that you should not EVER
mix data and code - it ends up being slow on modern CPU's).

>Other fixed offsets from &tss->esp0 are possible - especially nice
>would be to share a cache line with the GDT's hot cache line. (To do
>this, place GDT before TSS, make KERNEL_CS near the end of the GDT,
>and then the accesses to GDT, trampoline and tss->esp0 will all touch
>the same cache line if you're lucky).

Since almost all x86 CPU's have some kind of cacheline exclusion policy
between the I$ and the D$ (to handle the strict x86 I$ coherency
requirements), your "if you're lucky" is completely bogus. In fact,
you'd be the _pessimal_ cache behaviour for something like that, ie you
get lines that ping-pong between the L2 and the two instruction caches.

Don't do it. Keep data and code on separate pages.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/