RE: [Lse-tech] Re: [RFC] Dynamic percpu data allocator

BALBIR SINGH (balbir.singh@wipro.com)
Fri, 24 May 2002 17:29:28 +0530


Thanks. When I said CPU-local memory on SMP, I meant the CPU cache.
Sorry for the confusion. I think your dynamic allocator makes
sense.

Balbir

|
|On Fri, May 24, 2002 at 02:08:50PM +0530, BALBIR SINGH wrote:
|>
|> Sure, I understand what you are talking about now. That makes a lot
|> of sense, I will go through your document once more and read it.
|> I was thinking of the two combined (allocating CPU local memory
|> for certain data structs also includes allocating one copy per CPU).
|> Is there a reason to delay the implementation of CPU local memory,
|> or are we waiting for NUMA guys to do it? Is it not useful in an
|> SMP system to allocate CPU local memory?
|
|In an SMP system, the entire memory is equidistant from the CPUs.
|So, any memory that is exclusively accessed by one CPU only
|is CPU-local. On a NUMA machine however that isn't true, so
|you need special schemes.
|
|The thing about the one-copy-per-CPU allocator that I describe is that
|it interleaves per-CPU data to save space. That is, if you allocate
|per-CPU ints i1 and i2, they will be laid out in memory like this:
|
|       CPU #0          CPU #1
|
|       ---------       ---------       Start of cache line
|       i1              i1
|       i2              i2
|
|       .               .
|       .               .
|       .               .
|       .               .
|       .               .
|
|       ---------       ---------       End of cache line
|
|The per-CPU copies of i1 and i2 for CPU #0 and CPU #1 are allocated from
|different cache lines of memory, but the copies of i1 and i2 for CPU #0
|are in the same cache line. This interleaving saves space by avoiding
|the need to pad small data structures to cache line size.
|This is essentially how the static per-CPU data area in the 2.5 kernel
|is laid out in memory. Since the copies of the same variable for
|CPU #0 and CPU #1 are on different cache lines, code that accesses
|only "this" CPU's copy will not cause cache line bouncing.
|On an SMP machine, I can use the slab allocator to allocate the
|cache lines for the different CPUs in which the interleaved data
|structures are laid out. On a NUMA machine, however, I would want
|to make sure that the cache line allocated for this purpose for
|CPU #N is as close as possible to CPU #N.
|
|
|Thanks
|--
|Dipankar Sarma <dipankar@in.ibm.com> http://lse.sourceforge.net
|Linux Technology Center, IBM Software Lab, Bangalore, India.
|-
|To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
|the body of a message to majordomo@vger.kernel.org
|More majordomo info at http://vger.kernel.org/majordomo-info.html
|Please read the FAQ at http://www.tux.org/lkml/
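
The interleaved layout described in the quoted message can be sketched in
user-space C. This is only an illustration under stated assumptions, not
the allocator from the RFC: NR_CPUS, CACHE_LINE, struct percpu_block,
percpu_alloc() and percpu_ptr() are hypothetical names, the 64-byte line
size is assumed, and a static array stands in for the slab allocator
(on NUMA, each block would instead come from memory near its CPU).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS     2    /* assumed CPU count, matching the diagram */
#define CACHE_LINE  64   /* assumed cache line size in bytes */

/* One cache-line-aligned block per CPU (hypothetical type). */
struct percpu_block {
    _Alignas(CACHE_LINE) unsigned char data[CACHE_LINE];
};

static struct percpu_block blocks[NR_CPUS];
static size_t next_offset;   /* next free byte, identical in every block */

/*
 * Reserve `size` bytes in every CPU's block and return the shared
 * offset, or (size_t)-1 if the block is full. Small objects are packed
 * back to back instead of each being padded to a full cache line.
 * `align` must be a power of two.
 */
static size_t percpu_alloc(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    size_t off = (next_offset + align - 1) & ~(align - 1);
    if (off > CACHE_LINE || size > CACHE_LINE - off)
        return (size_t)-1;

    next_offset = off + size;
    return off;
}

/* Address of the copy at offset `off` that belongs to CPU `cpu`. */
static void *percpu_ptr(size_t off, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return blocks[cpu].data + off;
}
```

Allocating i1 and i2 with percpu_alloc(sizeof(int), _Alignof(int)) gives
two offsets in the same block, so both copies for a given CPU share one
cache line, while percpu_ptr(off, 0) and percpu_ptr(off, 1) for the same
offset fall in different cache lines.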
