Re: [PATCH 4/3] Replace dynamic percpu implementation

Rusty Russell (rusty@au1.ibm.com)
Fri, 23 May 2003 09:56:01 +1000


In message <20030522104944.GE27614@in.ibm.com> you write:
> On Thu, May 22, 2003 at 06:36:31PM +1000, Rusty Russell wrote:
> > Interesting: personally I consider the cacheline sharing a feature,
> > and unless you've done something special, the static declaration
> > should be interlaced too, no?
>
> Yes, the static declaration was interlaced too. What I meant to say is that
> the cacheline sharing feature helped alloc_percpu/static percpu compensate
> for the small extra memory reference cost of fetching __percpu_offset[]
> when compared with kmalloc_percpu_new.

Ah, thanks, that clarifies. Sorry for my misread.

> > Aside: if kmalloc_percpu uses the per-cpu offset too, it probably
> > makes sense to make the per-cpu offset a first class citizen, and
> > smp_processor_id to be derived from it, rather than the other way
> > around as at the moment. This would offer further speedup by removing
> > a level of indirection.
> >
> > If you're interested I can probably produce such a patch for x86...
>
> Sure, it might help per-cpu data but will it cause performance
> regression elsewhere? (other users of smp_processor_id).

AFAICT, all the time-critical smp_processor_id() things are basically
for indexing into a per-cpu data array. Even things like module.h and
percpu_counter.h would benefit from replacing those huge
inside-structure [NR_CPUS] arrays with a dynamic allocation.
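
To make the contrast concrete, here is a rough userspace sketch of the
two styles. It is not kernel code: NR_CPUS, PERCPU_SLOTS, struct
percpu_area, percpu_init(), percpu_ptr(), cpu_id_of() and percpu_sum()
are all names made up for illustration, and the "per-cpu area" is just
a calloc'ed block standing in for the real per-cpu offset machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 4        /* hypothetical value for this sketch */
#define PERCPU_SLOTS 8   /* hypothetical number of slots per area */

/* Style 1: a fixed [NR_CPUS] array embedded in the structure itself. */
struct counter_array {
	long count[NR_CPUS];
};

static long array_sum(const struct counter_array *c)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->count[cpu];
	return sum;
}

/*
 * Style 2: one dynamically allocated area per CPU.  An object is named
 * by its slot (its offset within the area), and the CPU number is read
 * back from the area, so the base pointer is the primary handle and
 * the id is derived from it.
 */
struct percpu_area {
	int cpu_id;
	long slot[PERCPU_SLOTS];
};

static struct percpu_area *percpu_base[NR_CPUS];

/* Allocate and zero every area; returns 0 on success, -1 on failure. */
static int percpu_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		percpu_base[cpu] = calloc(1, sizeof(*percpu_base[cpu]));
		if (!percpu_base[cpu])
			return -1;
		percpu_base[cpu]->cpu_id = cpu;
	}
	return 0;
}

/* Resolve (area, slot) to this CPU's copy of the object. */
static long *percpu_ptr(struct percpu_area *base, size_t slot)
{
	assert(slot < PERCPU_SLOTS);
	return &base->slot[slot];
}

/* The CPU number is derived from the area, not the other way around. */
static int cpu_id_of(const struct percpu_area *base)
{
	return base->cpu_id;
}

/* Sum one slot across all per-cpu areas. */
static long percpu_sum(size_t slot)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += *percpu_ptr(percpu_base[cpu], slot);
	return sum;
}
```

In the first style every structure carries NR_CPUS entries whether or
not it needs them; in the second the structure only has to remember a
slot, and a CPU that already holds its own base pointer reaches its
copy with a single add.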

> I can run it through the same tests and find out. Maybe it'll make
> good paper material for later? ;)

I'll try to find time today or early next week.

Thanks!
Rusty.

--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.