> OLD: when sched_yield() is called the task moves to expired,
> every other task in the active queue will run first before the
> yielding task will run again.
I really think this is the right way.
> NEW: move the yielding task to the end of its current priority level,
> but keeps it active not expired.
This takes us back to the problem we saw in earlier sched_yield()
implementations. A group of yielding threads just round-robin between
themselves, yielding over and over. Worse, even a single task alone in
a priorty level will show up as a CPU hog if it keeps calling
sched_yield() in a loop.
It goes on. Assume we have two runnable tasks, one that does whatever
it wants (hopefully something useful), and the other which does:
With the current sched_yield(), the second will receive much less
processor time than the first (nearly none vs. most of the processor).
With the sched_yield() mentioned above, they will receive identical
amounts of processor time. That does not seem sane to me.
I think it is important that sched_yield() give processor time to all
tasks, and not just between multiple yielding tasks.
The current implementation does this. If an application (*cough* Open
Office *cough*) calls sched_yield() over and over, what does it expect?
Now that we have futexes, sched_yield() no longer needs to be used as a
poor replacement for blocking, and it can have sane semantics, such as
_really_ yielding the processor.
> What else could be done?
> (a) drop the effective priority of the yielding task by a percentile,
> but don't reduce the time slice!
This works, too. We used to do this..
There are a couple bits that need to be added, though, to deal with
threads that call sched_yield() over and over (which are the ones where
we have problems). We need to drop the task a priority level every time
it calls sched_yield(). Eventually it will reach the lowest priority
(or some earlier threshold we want to check for) and then we need to put
it on the expired list, like the current behavior.
So for the big offenders, I think this ends up being the same, no?
Also, this approach does not work for real-time tasks, for whom we must
not change their priority... so we end up just requeing them, too.
Just my thoughts...
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to email@example.com
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/