2.4.20: scheduler issue: bad scheduling latency case

Hiroshi Inoue (inoueh@uranus.dti.ne.jp)
Wed, 07 May 2003 01:13:22 +0900


Hi,

I have found a case which may introduce bad scheduling latency up to
10 msec (or 1/HZ sec) in task scheduler of kernel 2.4.20 at SMP machine.

In schedule(), if other CPU in the system set "need_resched" flag
of task struct within the section showed below in order to request
rescheduling, this reschedule request can be neglected.

case TASK_RUNNING:;
}
***** prev->need_resched = 0; *************** // begin section

/*
* this is the scheduler proper:
*/

(Omission)

/*
* from this point on nothing can prevent us from
* switching to the next task, save this fact in
* sched_data.
*/
***** sched_data->curr = next; ************* // end section
task_set_cpu(next, this_cpu);

This case seems to be very rare, but it was observed that this occurred
several times while I compiled a Linux kernel in my environment (machine
with 2 logical CPUs by Hyper-Threading enabled processor).

A simple patch for this issue is attached.
Does it make sense?

diff -Nru linux-2.4.20-orig/kernel/sched.c linux-2.4.20/kernel/sched.c
--- linux-2.4.20-orig/kernel/sched.c Fri Nov 29 08:53:15 2002
+++ linux-2.4.20/kernel/sched.c Fri Apr 11 16:04:34 2003
@@ -625,6 +625,11 @@
goto repeat_schedule;
}

+ if (unlikely(prev->need_resched)) {
+ prev->need_resched = 0;
+ goto repeat_schedule;
+ }
+
/*
* from this point on nothing can prevent us from
* switching to the next task, save this fact in

Regards,
Hiroshi Inoue <inoueh@uranus.dti.ne.jp>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/