Re: context switch vs. signal delivery [was: Re: Accelerating user mode linux]

Andi Kleen (ak@muc.de)
04 Aug 2002 08:46:40 +0200


Ingo Molnar <mingo@elte.hu> writes:

> actually the opposite is true, on a 2.2 GHz P4:
>
> $ ./lat_sig catch
> Signal handler overhead: 3.091 microseconds
>
> $ ./lat_ctx -s 0 2
> 2 0.90
>
> ie. *process to process* context switches are 3.4 times faster than signal
> delivery. Ie. we can switch to a helper thread and back, and still be
> faster than a *single* signal.

This is because the signal save/restore path does a lot of unnecessary work.
One optimization I implemented at one time was an SA_NOFP signal flag that
told the kernel the signal handler did not intend to modify floating point
state (few signal handlers need FP). The kernel then skipped saving the FPU
state, which gave quite a speedup in signal latency.
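
Roughly, usage would look like the sketch below. The flag name comes from
the description above, but the value and the exact semantics here are
invented purely for illustration and were never part of a released kernel
ABI; on a stock kernel the unknown flag is simply ignored.

	#include <signal.h>
	#include <stdio.h>
	#include <string.h>

	#ifndef SA_NOFP
	#define SA_NOFP 0x04000000	/* invented value, illustration only */
	#endif

	static void handler(int sig)
	{
		/* integer-only work; no FP or SSE registers touched */
		(void)sig;
	}

	int main(void)
	{
		struct sigaction sa;

		memset(&sa, 0, sizeof(sa));
		sa.sa_handler = handler;
		sa.sa_flags = SA_NOFP;	/* "don't save FPU state for this handler" */
		sigemptyset(&sa.sa_mask);

		if (sigaction(SIGUSR1, &sa, NULL) < 0)
			perror("sigaction");

		raise(SIGUSR1);
		return 0;
	}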

Signal delivery in Linux got a lot slower when SSE2 support was added
(there is much more FPU state to save and restore); the SA_NOFP patch
won that speed back.

The target was certain applications that use signal handlers for
asynchronous I/O.
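
For context, a minimal sketch of that kind of application, driven by
SIGIO/O_ASYNC notification on a file descriptor; the handler and the main
loop are illustrative only, not taken from any particular program:

	#include <fcntl.h>
	#include <signal.h>
	#include <string.h>
	#include <unistd.h>

	static volatile sig_atomic_t io_ready;

	static void sigio_handler(int sig)
	{
		(void)sig;
		io_ready = 1;	/* real code would read() or queue work here */
	}

	int main(void)
	{
		int fd = STDIN_FILENO;
		struct sigaction sa;

		memset(&sa, 0, sizeof(sa));
		sa.sa_handler = sigio_handler;
		sigemptyset(&sa.sa_mask);
		sigaction(SIGIO, &sa, NULL);

		fcntl(fd, F_SETOWN, getpid());				/* route SIGIO here */
		fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_ASYNC);	/* async notification */

		while (!io_ready)
			pause();	/* every wakeup costs one signal delivery */
		return 0;
	}

Every I/O completion costs a full signal delivery, which is why the FPU
save on the signal path matters so much for these programs.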

If there is interest I can dig up the old patches. They were really simple.

x86-64 also does it faster: it FXSAVEs directly into the user-space signal
frame, using exception handling to catch a faulting user pointer, instead
of copying the state manually. That's not possible on i386 because it
still has to use the baroque iBCS FP context format on the stack.
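
The idea is roughly the following. This is a simplified, 2002-era-style
kernel-context sketch, not the actual arch/x86_64 code: the function name
is invented and error handling is reduced to an exception-table fixup that
returns -EFAULT when the user pointer faults.

	/* Kernel-context sketch only -- not the real signal frame code. */
	static inline int fxsave_to_user_frame(void *buf)	/* 16-byte aligned user pointer */
	{
		int err = 0;

		asm volatile("1:	fxsave (%1)\n"	/* 512-byte FPU/SSE state -> user frame */
			     "2:\n"
			     ".section .fixup,\"ax\"\n"
			     "3:	movl %2,%0\n"	/* user pointer faulted: return -EFAULT */
			     "	jmp 2b\n"
			     ".previous\n"
			     ".section __ex_table,\"a\"\n"
			     "	.align 8\n"
			     "	.quad 1b,3b\n"	/* fault at 1b resumes at fixup 3b */
			     ".previous"
			     : "+r" (err)
			     : "r" (buf), "i" (-14 /* -EFAULT */)
			     : "memory");
		return err;
	}

The exception table turns a bad user address into an error return, so no
intermediate kernel buffer and no extra copy to user space are needed.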

-Andi