> a) ``Other operating systems offer "soft realtime" and we don't want to
> port our code from that to the RT Linux model.'' Translation:
> ``other operating systems have made poor design decisions, let's try
> and pressure Linus into doing the same thing with Linux.'' ...
You should investigate the problem more deeply before saying
"we do not want to port our code".
Do you really think that the multimedia driver folks are willing to port
hundreds of thousands of lines to the RTLinux model and then maintain 2 sets of
APIs to satisfy the realtime multimedia folks?
Please read the attached mails from Paul and me CAREFULLY, to get an idea WHY
RTLinux is not the best solution from multiple points of view.
Some quick examples:
- target audience: how many of the millions of Linux desktops have
RTLinux installed, how many distros ship it/install it by default, when will
Linus accept RTLinux into the mainstream kernel, when will Joe Sixpack
be able to buy Redhat 12.0 and run his smooth playing software DVD-decoder
or run his cool software-audio-synth?
The frustrating thing is that Ingo's patch, although ugly, can cover
ALL realtime multimedia needs NOW, and do this better than Windows and Mac, with a
simple userspace programming model (as opposed to Win & Mac developers being forced
to stay at kernel level and/or talk directly to the hardware in order to get
good latencies because the OS is broken).
- too restrictive: problems with large memory needs,
working in kernelspace (do we really want to run COMPLEX apps, e.g. a
physical modelling synth, within the kernel space?),
debugging disadvantages (in userspace code you can add a few printf()s to debug
stuff, and remove SCHED_FIFO so that you can press CTRL-C if something goes wrong),
no interaction with multimedia drivers like audio, MIDI, framebuffer and
video4linux
- the perceivable "latencies" of ear and eye are limited (ear: <5ms latencies
are perceived as realtime; eye: more than 25-30 frames/sec, i.e. 33-40ms per
frame, is perceived as fluid animation).
So why should I use RTLinux, which can do 20-50usec, when 99.99% of cases can be
covered using a userspace app running SCHED_FIFO / mlock(), because most
apps do not need <5msec latencies?
Victor Yodaiken asked if some of the linux-audio-dev folks would consider
using RTLinux for audio, and most of the responses were negative;
see the attached text below.
(Do not forget that as soon as you try to drive audio latencies under a certain
value, you will get so many context switches / sec that it will thrash the
CPU's cache. Plus, assuming you are running chained plugins,
the CPU will spend more time doing procedure calls, stack ping-pong
and DSP-algorithm setup than doing actual USEFUL DSP stuff.)
PS: every few days I get a mail from people asking NON audio-related stuff
like:
"I have an application which polls an external controlling device with a
frequency of 200Hz. Do I need RTLinux for this?"
Ingo's kernel can easily meet this requirement, thus the app above becomes a simple:
    set SCHED_FIFO
    mlockall()
    while(1)
    {
        sleep_5ms()    (using RTC for example)
        poll_device()
        record_your_data()
    }
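
For anyone who wants to try the loop above, here is a minimal C sketch of it. It is only one way to do it and makes several assumptions: it sleeps with clock_nanosleep() on CLOCK_MONOTONIC rather than using the RTC, the names advance_deadline(), enter_realtime() and run_polling_loop() are made up for this example, and poll_device() / record_your_data() are empty stand-ins for the real work. Getting SCHED_FIFO and mlockall() normally needs root or suitable rlimits, so the loop is kept separate from enter_realtime() and also runs (with worse latencies) without it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sched.h>
#include <sys/mman.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

int polls_done;  /* counts loop iterations, for illustration only */

/* Move an absolute deadline forward by period_ns (0 <= period_ns < 1 s). */
void advance_deadline(struct timespec *t, long period_ns)
{
    t->tv_nsec += period_ns;
    if (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec += 1;
    }
}

/* Ask for SCHED_FIFO and lock all memory; returns 0 on success, -1 if
 * either request is refused (typically for lack of privileges). */
int enter_realtime(int priority)
{
    struct sched_param sp = { .sched_priority = priority };

    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -1;
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;
    return 0;
}

/* Hypothetical stand-ins for the application's real work. */
void poll_device(void) { polls_done++; }
void record_your_data(void) { }

/* Run `iterations` cycles, one every period_ns. Sleeping until an absolute
 * deadline keeps the period from drifting when an iteration runs long. */
int run_polling_loop(long period_ns, int iterations)
{
    struct timespec next;
    int rc;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;
    for (int i = 0; i < iterations; i++) {
        advance_deadline(&next, period_ns);
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);  /* retry if a signal woke us early */
        if (rc != 0)
            return -1;
        poll_device();
        record_your_data();
    }
    return 0;
}
```

A 200Hz poller would call enter_realtime() once (the priority value is up to the application) and then run_polling_loop(5000000L, n) for a 5ms period.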
I think this is what MontaVista thinks:
many "realtime" applications (even many of the embedded ones) do not require
usec accuracy but only 1-5msec reaction time, and this can be done comfortably
staying in userspace, provided that the kernel can deliver good latencies.
PS: have you thought about the speedup that a low-latency kernel could give
to a Beowulf cluster when heavy disk I/O occurs?
Assume a 200 node Beowulf where all machines wait on a barrier on every
calculation iteration: if one machine stalls for 100msecs (happens during heavy
disk I/O), 200 machines will have to wait for the stalled node, thus 200 CPUs
will lose 100msec of processing time.
Since these 100ms stalls can potentially happen (current kernels) every 30-60
secs on each box, assuming uniform distribution a stall could happen every
60/200 secs = 0.3 secs (worst case), so losing 100msecs every 300msecs
means a performance drop of up to 20-30%.
NOT NICE.
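
The estimate above can be reproduced with a few lines of C. This is only a back-of-the-envelope sketch: the function names are invented here, and it assumes stalls are spread evenly, never overlap, and each one adds its full length on top of the useful work done between stalls.

```c
/* Mean time between stalls anywhere in the cluster, if each of `nodes`
 * machines stalls once every per_node_interval_s seconds and the stalls
 * are spread evenly and do not overlap. */
double cluster_stall_interval_s(double per_node_interval_s, int nodes)
{
    return per_node_interval_s / nodes;
}

/* Fraction of wall-clock time lost at the barrier, if every stall of
 * stall_s seconds is added on top of interval_s seconds of useful work. */
double barrier_loss_fraction(double stall_s, double interval_s)
{
    return stall_s / (interval_s + stall_s);
}
```

With 200 nodes and one stall per node every 60 secs, cluster_stall_interval_s(60.0, 200) gives 0.3 secs, and barrier_loss_fraction(0.1, 0.3) gives 0.25, i.e. a quarter of the time is spent waiting, which sits inside the 20-30% range quoted above.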
Benno.
Paul Barton-Davis wrote:
>I don't see why it would require a "complete rewrite".
the ALSA drivers currently represent a minimum of 64000 C statements
(i counted semicolons in the files; there are 174,000 lines in total,
including comments).
although i strongly suspect that most of the work involved in porting
them to RTLinux could be done with a portability "wrapper", i also
strongly suspect that we'd have to go over every single line of every
file and check it for various assumptions.
this doesn't include the video/tv interfaces needed by actual
MULTI-media apps (as opposed to multi-media).
no, it's not a "complete rewrite", but it's about as close to it as you
can get without it actually becoming one.
>> i think, however, that all of this is moot. For almost all of us, the
>> performance offered by Ingo's patches to 2.2.{10,11,12,13,14,15} is
>> more than adequate. Using RTLinux involves, at the absolute minimum,
>> some real work, a definite learning experience, and possibly a
>> redesign of already operational applications. To gain what ? Better
>> performance than we actually need. victor, its a tough sell ;)
>
>Someone told me that about "C" in 1980. How's that for a stunningly
>irrelevant response?
>I think that the motivations of performance and intellectual
>curiosity are strong.
take a look at the hardware.
44.1kHz -> 1 sample = 22usec
48kHz -> 1 sample = 20usec
96kHz -> 1 sample = 10usec
192kHz -> 1 sample = 5usec
so RTLinux would get our performance down to roughly the same level
as the converters.
alas, it's not so simple. there are always buffers on the audio
interface which accumulate sample data in both directions. plus, there
is the PCI bus burst transfer size (64 bytes). so, let's assume a
minimum transfer size of 64 bytes:
16 32 bit samples @ 48kHz: 320usec
32 16 bit samples @ 48kHz: 640usec
so, clearly, RTLinux would get us down to the h/w limits of today's
audio interfaces.
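
for anyone who wants to check the numbers, here is a small C sketch of the arithmetic. the function names are made up for this example and it assumes a single (mono) stream with no padding in the transfer. note that the figures above round the 48kHz sample period to 20usec; the unrounded results come out at roughly 333usec and 667usec.

```c
/* Duration of one sample in microseconds at the given sample rate. */
double sample_period_usec(double rate_hz)
{
    return 1e6 / rate_hz;
}

/* How many samples of the given width fit into one transfer. */
int samples_per_transfer(int transfer_bytes, int bits_per_sample)
{
    return transfer_bytes * 8 / bits_per_sample;
}

/* Time it takes the converter to produce (or consume) one full transfer,
 * i.e. the smallest latency step the hardware can offer. */
double transfer_duration_usec(int transfer_bytes, int bits_per_sample,
                              double rate_hz)
{
    return samples_per_transfer(transfer_bytes, bits_per_sample)
           * sample_period_usec(rate_hz);
}
```

for example, transfer_duration_usec(64, 32, 48000.0) covers the 16-sample case and transfer_duration_usec(64, 16, 48000.0) the 32-sample case.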
this is an attractive goal in some senses, except for the fact that as
Benno has pointed out, doing a single pass through our audio algorithm
is probably going to cost *at least* the same time *just for the
overhead* (i.e forget any actual computation).
so, it's back to the age-old (well, as old as the Music N language
family) tradeoff between a "blocksize" (number of samples processed
per iteration) that favors rapid response and one that favors high
efficiency.
realistically, people have found with programs like Csound that the
best compromise value is probably on the order of 100 to 1000
samples depending on the application. in some cases, a blocksize of
1 is necessary (e.g. to properly implement some kinds of
filters/delays).
Let's pick 256 as a nice power of 2 number:
256 samples @ 48kHz: about 5.3ms
Presto: we can do 2.5ms with Ingo's code. So we're in the clear here
for almost all applications.
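
The blocksize tradeoff can be put into numbers with a short C sketch. The function names are invented for this example, and the fixed per-iteration overhead is purely an assumed figure that you would have to measure on your own system; the sketch only shows how the overhead share grows as the block shrinks.

```c
/* Time covered by one block of samples, in milliseconds. */
double block_duration_ms(int block_samples, double rate_hz)
{
    return 1000.0 * block_samples / rate_hz;
}

/* How often the audio thread has to wake up per second. */
double wakeups_per_second(int block_samples, double rate_hz)
{
    return rate_hz / block_samples;
}

/* Share of each block period eaten by a fixed per-iteration cost
 * (context switch, call overhead, algorithm setup). overhead_usec is an
 * assumed input, not a measured value. */
double overhead_fraction(double overhead_usec, int block_samples,
                         double rate_hz)
{
    return overhead_usec / (1000.0 * block_duration_ms(block_samples, rate_hz));
}
```

At 48kHz a 256-sample block means 187.5 wakeups per second, while a 16-sample block means 3000. With an assumed 50usec of overhead per iteration, that is under 1% of the period at 256 samples but 15% at 16 samples.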
There are a few cases where the ability to do "better" than the h/w
can is a win. But these are very rare - I don't know of any current
pro-audio gear that even attempts such a thing. Single-sample transfer
is common for thru routing, but it's rare when doing signal processing.
My take-home point: RTLinux is a fabulous piece of work. But it lets
us do better than we need to at the cost of a bunch of driver porting,
app redesign, and some learning, contrasting with the low latency
patches which involve no change in app design (i.e code runs on
dog-slow latency systems too, albeit poorly), no "porting" of drivers
to a new "platform", and the chance to carry on as usual. ah, what a
pathetic specimen i am.
--p
Benno Senoner wrote:
Paul's summary is quite complete (blocksize tradeoffs vs efficiency;
plus do not forget cache thrashing when doing 5000 context
switches per second).
What I want to add is that, although Victor's offer is nice,
besides the fact that the ALSA, video4linux and framebuffer folks
(i.e. the multimedia driver writers) are not willing and do not have the time
to port the code to RTLinux and maintain two sets of drivers,
I have another big fear:
(and 1ms latencies are not a big deal on CPUs which can do
500 million operations/sec = 500K ops per msec.)
And as Paul pointed out, audio latency limitations
(PCI burst size = 64 bytes; blocksizes have to be bigger than a certain
amount, e.g. >1-2ms worth of audio, if you don't want to burn
80% of the available CPU cycles doing useless stuff instead
of actual DSP work, etc.)
make RTLinux look like using a Ferrari when driving on a mountain road:
expensive, and you will always be forced to drive at low speeds (1st, 2nd gear).
BTW: one advantage of debugging userspace realtime audio apps:
during the debug phase do not set SCHED_FIFO, and use big blocksizes
(there will be a delay due to the bigger buffersizes, but audio will play
correctly).
Using a bigger buffersize, you can even use printf() within the main loop
to display for example buffer pointers and other internal variables.
As soon as your app feels bugfree,
remove the printf()s, decrease the buffersize to the desired amount,
and enable SCHED_FIFO and mlock().
I use this technique ALL the time and it has been very helpful to debug my
stuff and speed up development.
I don't think that this is easily possible under RTLinux.
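
A rough C sketch of this debug/realtime switch follows. It is just an illustration of the idea, not code from any of my apps: struct audio_config, choose_config() and apply_config() are names invented here, and the blocksizes 4096 and 128 are arbitrary example values.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <sys/mman.h>

/* Hypothetical settings bundle for one audio app. */
struct audio_config {
    int block_samples;   /* samples processed per iteration */
    int realtime;        /* 1: use SCHED_FIFO and lock memory */
    int verbose;         /* 1: printf() inside the main loop is OK */
};

/* Debug mode: big blocks, normal scheduling, chatty output.
 * Production mode: small blocks, SCHED_FIFO, no output in the loop. */
struct audio_config choose_config(int debug)
{
    struct audio_config cfg;

    cfg.block_samples = debug ? 4096 : 128;  /* example values only */
    cfg.realtime = !debug;
    cfg.verbose = debug;
    return cfg;
}

/* Apply the scheduling part of the config. In debug mode this does
 * nothing, so CTRL-C still works if the app runs away. Returns 0 on
 * success, -1 if the realtime requests are refused. */
int apply_config(const struct audio_config *cfg, int priority)
{
    struct sched_param sp = { .sched_priority = priority };

    if (!cfg->realtime)
        return 0;
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -1;
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;
    return 0;
}
```

The main loop then reads cfg.block_samples and only calls printf() when cfg.verbose is set, so switching between the two modes is a one-flag change.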
I think that this userspace vs RTLinux programming model is one of the key
points of MontaVista (ease of use, no need to learn new
interfaces, and a speedup in development time).
Even if they do not aim at achieving 20-50usec latencies as RTLinux does,
they say many applications (embedded and non-embedded) can live
with 1ms worst-case latencies.
Video is similar to audio: a 200Hz frame update rate doesn't buy us much
over a 60Hz update frequency since the reaction speed of the retina is limited
(cinema is 24FPS, TV 50/60Hz interlaced),
and 60Hz = 16msec = PLENTY of time for a lowlatency kernel which has
worst case latencies around 1-2msecs.
So as you can see video is even less demanding than audio.
Audio is probably the most demanding multimedia field in terms of
latency , so my conclusion is that the lowlatency way is adequate to cover
all needs of a modern multimedia desktop.
Of course driver writers are required to meet certain criteria,
like not turning IRQs off for a long time (e.g. more than 1ms) and keeping
the codepaths short (in the case of design-constrained long codepaths,
check for rescheduling).
Of course we should not drop efforts to research methods to achieve even
better performance than now, therefore I will keep an eye on
RTLinux + audio, but what the Linux audio community wants is to have a
solid platform sooner rather than later
(e.g.: load up a plain Redhat 7.0, run your RT audio app (a userspace app which runs
SCHED_FIFO), and get rock solid 3ms latencies, without the need to install
low-latency patches, RTLinux drivers etc).
I know a few studio folks, and they see computers only as a tool, just like
dedicated racks. They do not care about technical details; their only goal
is to make music, and if the PC is not easy enough to use, they will look for
other alternatives.