Re: [patch] severe softirq handling performance bug, fix, 2.4.5

Andrea Arcangeli (andrea@suse.de)
Sun, 27 May 2001 22:24:45 +0200


On Sun, May 27, 2001 at 12:09:29PM -0700, David S. Miller wrote:
> "live lock". What do you hope to avoid by pushing softirq processing
> into a scheduled task? I think doing that is a stupid idea.

NOTE: I'm not pushing anything out of the atomic context, I'm using
ksoftirqd only to cure the cases that are getting wrong by the fast-path
code.

It fixes the 1/HZ latency with the idle task and it gets right the case
of softirq marked pending again under do_softirq.

> You are saying that if we are getting a lot of soft irqs we should
> defer it so that we can leave the trap handler, to avoid "live lock".

Yes, that is what all kernels out there are doing.

> I think this is a bogus scheme for several reasons. First of all,
> deferring the processing will only increase the likelyhood that the
> locality of the data will be lost, making the system work harder.

Of course almost all the time the processing is not deferred.

> Secondly, if we are getting softirqs at such a rate, we have other
> problems. We are likely getting surged with hardware interrupts, and
> until we have Jamals stuff in to move ethernet hardware interrupt
> handling into softirqs your deferrals will be fruitless when they do
> actually trigger. We will be livelocked in hardware interrupt
> processing instead of being livelocked in softirq processing, what an
> incredible improvement. :-)

The sofitrq could be marked running also from another softirq, it
doesn't need to be an interrupt.

> from trap, no matter what kind, call do_softirq if softirqs are
> pending.

that just happens, I assume you mean you prefer to remove mask &=
~active from do_softirq() internally.

But doing that will hang in irq as soon as a softirq or tasklet or
that marks itself running again in software (and I think this happens
just now in some driver to poll the hardware from atomic context where
you cannot schedule). Furthmore the softirq is going to take more time
than the irq core itself, so the live lock issue is not so obvious as
you say I think.

> Again, I am totally against ksoftirqd, I think it's a completely dumb
> idea. Softirqs were meant to be as light weight as possible, don't
> crap on them like this with this heavyweight live lock "solution".
> It isn't even solving live locks, it's rather trading one kind for
> another with zero improvement.

softirq is still as light as possible but without ksoftirq the logic is
wrong in some case, and so it can get a seneless 1/HZ latency sometime.
ksofitrqd fixes all those broken cases in a clean manner.

I'd like to know how much it helps on the gigabit benchmarks. For the
100mbit ethernet that I can test locally it is fine.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/