Re: [RFC] Simple NUMA scheduler patch

Erich Focht (efocht@ess.nec.de)
Wed, 2 Oct 2002 19:54:39 +0200


Hi Michael,

On Wednesday 02 October 2002 01:55, Michael Hohnbaum wrote:
> Attached is a patch which provides a rudimentary NUMA scheduler.
> This patch basically does two things:
>
> * at exec() it finds the least loaded CPU to assign a task to;
> * at load_balance() (find_busiest_queue() actually) it favors
> cpus on the same node for taking tasks from.

it's a start. But I'm afraid a full solution will need much more code
(which is one of the problems with my NUMA scheduler patch).

The ideas behind your patch are:
1. Do initial load balancing, choose the least loaded CPU at the
beginning of exec().
2. Favor own node for stealing if any CPU on the own node is >25%
more loaded. Otherwise steal from another CPU if that one is >100%
more loaded.

1. is fine but should ideally aim for equal load among nodes. In
the current form I believe that the original load balancer does the
job right after fork() (which must have happened before exec()). As
you changed the original load balancer, you really need this initial
balancing.

2. is ok as it makes it harder to change the node. But again, you don't
aim at equally balanced nodes. And: if the task gets away from the node
on which it got its memory, it has no reason to ever come back to it.

For a final solution I believe that we will need targets like:
(a) equally balance nodes
(b) return tasks to the nodes where their memory is
(c) make nodes "sticky" for tasks which have their memory on them,
"repulsive" for other tasks.
But for a first attempt to get the scheduler more NUMA aware all this
might be just too much.

With simple benchmarks you will most probably beat the plain O(1)
scheduler on NUMA if you implement (a) in just 1. and 2. as your node
is already somewhat "sticky". In complicated benchmarks (like a kernel
compile ;-) it could already be too difficult to understand when the
load balancer did what and why...

It would be nice to see some numbers.

Best regards,
Erich

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/