Re: VM Report was:Re: Break 2.4 VM in five easy steps

Andrew Morton (andrewm@uow.edu.au)
Sat, 09 Jun 2001 12:36:24 +1000


Jonathan Morton wrote:
>
> [ Re-entering discussion after too long a day and a long sleep... ]
>
> >> There is the problem in terms of some people want pure interactive
> >> performance, while others are looking for throughput over all else,
> >> but those are both extremes of the spectrum. Though I suspect
> >> raw throughput is the less wanted (in terms of numbers of systems)
> >> than keeping interactive response good during VM pressure.
> >
> >And this raises a very very important point: raw throughtput wins
> >enterprise-like benchmarks, and the enterprise people are the ones who pay
> >most of hackers here. (including me and Rik)
>
> Very true. As well as the fact that interactivity is much harder to
> measure. The question is, what is interactivity (from the kernel's
> perspective)? It usually means small(ish) processes with intermittent
> working-set and CPU requirements. These types of process can safely be
> swapped out when not immediately in use, but the kernel has to be able to
> page them in quite quickly when needed. Doing that under heavy load is
> very non-trivial.

For the low-latency stuff, latency can be defined as
the worst-case time to schedule a userspace process in
response to an interrupt.

That metric is also appropriate in this case, (latency equals
interactivity), although here you don't need to be so fanatical
about the *worst case*. A few scheduling blips here are less
fatal.

I have tools to measure latency (aka interactivity). At
http://www.uow.edu.au/~andrewm/linux/schedlat.html#downloads
there is a kernel patch called `rtc-debug' which causes
the PC RTC to generate a stream of interrupts. A user-space
task called `amlat' responds to those interrupts and
reads the RTC device. The patched RTC driver can then
measure the elapsed time between the interrupt and the
read from userspace. Voila: latency.

When you close the RTC device (by killing amlat), the RTC
driver will print out a histogram of the latencies.

`amlat' at present gives itself SCHED_RR policy and
runs under mlockall() - for your testing you'll need
to delete those lines.

So. Simple apply rtc-debug, run `amlat' and kill it
when you've finished the workload.

The challenge will be to relate the latency histogram
to human-perceived interactivity. I'm not sure of
the best way of doing that. Perhaps monitor the 90th
percentile, and aim to keep it below 100 milliseconds.
Also, `amlat' should do a bit of disk I/O as well.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/