Re: [2.4.17/18pre] VM and swap - it's really unusable

Andrea Arcangeli (andrea@suse.de)
Wed, 9 Jan 2002 12:24:18 +0100


On Tue, Jan 08, 2002 at 03:55:38PM -0500, Robert Love wrote:
> On Tue, 2002-01-08 at 10:29, Andrea Arcangeli wrote:
>
> > "extra schedule points all over the place", that's the -preempt kernel
> not the lowlatency kernel! (oh yeah, you don't see them in the source
> > but ask your CPU if it sees them)
>
> How so? The branch on drop of the last lock? It's not a factor in

Exactly, this is the reschedule point I meant. Oh, note that the branch is
unlikely (rarely taken) in the lowlatency patch too. Please count the number
of times you add this branch in -preempt and how many times we add it in
lowlat, and then tell me who is adding rescheduling points all over the place
in the kernel.

> This makes me think the end conclusion would be that preemptive
> multitasking in general is bad. Why don't we increase the timeslice
> and tick period, in that case?

That would increase performance, but we'd lose interactivity.

> One can argue the complexity degrades performance, but tests show
> otherwise. In throughput and latency. Besides, like I always say, its

Which benchmarks? You should make sure the CPU spends all its cycles in
the kernel to benchmark the performance degradation (this is the normal
case of web serving with a few gigabit ethernet cards using sendfile).

> ride. On the other hand, the patch has a _huge_ userbase and you can't

I question this, because it is too risky to apply. There is no way any
distribution or production system could consider applying the preempt kernel
and shipping it in its next 2.4 kernel update. You never know if a driver
will deadlock because it is doing a test-and-set-bit busy loop by hand
instead of using spin_lock, and you cannot audit all the device drivers out
there. It is not like the VM, which is self-contained and can be replaced
without any caller noticing; this instead impacts every single driver out
there, and you'd need to audit all of them, which I don't think is feasible,
and which should really be done by giving everybody the time to test. This is
also what makes a preempt config option risky: if we go preempt we should
force everybody to use it, at least during 2.5, so that we get useful
feedback from testers on all the hardware, or nobody could trust -preempt.
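
The kind of driver pattern I mean is roughly this (a hypothetical driver
fragment, only to show the failure mode, not code from any real driver):

        /* hand-rolled bit lock: preemption stays enabled while spinning */
        while (test_and_set_bit(0, &dev->busy))
                ;                       /* busy wait */
        /* ... touch the hardware ... */
        clear_bit(0, &dev->busy);

Under -preempt this can deadlock on UP: if the task holding the bit is
preempted by, say, a SCHED_FIFO task that then enters this loop, the spinner
never sleeps and the holder never runs again. With spin_lock the patch
disables preemption for the duration of the critical section, so that cannot
happen; but spin_lock is exactly what such a driver is not using.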

NOTE: I trust that your work with spinlocks, locks around per-CPU data
structures etc. is perfect; I trust that part. As said, it's the drivers
doing test-and-set-bit by hand, which you cannot audit, that are the problem
here and that make it potentially unstable, not your changes. And the per-CPU
data structures also sound a little risky (though for UP, for example, that's
not an issue).

> question that. You also can't question the benchmarks that show
> improvements in average _and_ worst case latency _and_ throughput.

I don't question that some benchmark is faster with -preempt; the
interesting thing is to find out why, because it shouldn't be the case.
Andrew for example mentioned software raid: there are good reasons why
-preempt could be faster there, so we added a single schedule point and we
have that case covered in 18pre2aa1. We don't need reschedule points all over
the place like in -preempt to cover things like that. It is good to find them
out so we can fix those bugs; I consider them bugs :).
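
For reference, an explicit point of that kind is roughly just the following
(a sketch of the lowlat-style conditional reschedule, not the exact 18pre2aa1
hunk):

        /* inside a long kernel-side loop, at a spot where no locks are held */
        if (current->need_resched)
                schedule();             /* yield voluntarily, then carry on */

One such branch per identified hot path, instead of one per spin_unlock.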

Again: I'm not completely against preempt. It can reach a mean latency much
lower than mainline (it can reschedule immediately in the middle of a long
copy_user, for example), so it definitely has value; it's just that I'm not
sure it is worth it.
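
To cover a long copy with explicit points instead, the copy has to be chunked
by hand, roughly like this sketch (which is only illustrative, not the real
copy_*_user code):

        while (len) {
                unsigned long chunk = len < PAGE_SIZE ? len : PAGE_SIZE;
                if (copy_to_user(dst, src, chunk))
                        return -EFAULT;
                dst += chunk; src += chunk; len -= chunk;
                if (current->need_resched)
                        schedule();     /* latency bounded by one chunk */
        }

whereas -preempt gets that latency win without touching the copy loop at all.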

Andrea