Re: 64GB NUMA-Q after pgcl

Andrea Arcangeli (andrea@suse.de)
Mon, 31 Mar 2003 01:19:45 +0200


Didn't you break the linux x86 ABI in mmap? The file offset must be a
multiple of the softpagesize, and binary apps can break with -EINVAL
under pgcl. The workaround is to allocate it in anon mem, but that's not
coherent if somebody changes the binary through MAP_SHARED, so the
semantics are still broken. In theory we could also have aliasing in the
cache, but it doesn't seem a good idea.
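
To make the ABI concern concrete, here is a minimal sketch of the
alignment rule only, not the kernel's mmap path. The function name
check_mmap_offset and the 32K soft page size are assumptions picked for
illustration; nothing above says which size pgcl actually uses.

```python
import errno

HARD_PAGE_SIZE = 4 * 1024    # x86 hardware page size
SOFT_PAGE_SIZE = 32 * 1024   # assumed pgcl soft page size, illustration only


def check_mmap_offset(offset, page_size):
    """Return 0 if a file offset passes the alignment rule, else -EINVAL.

    Hypothetical helper: it models only the requirement that the
    offset be a multiple of the page size the kernel exposes.
    """
    if offset % page_size != 0:
        return -errno.EINVAL
    return 0


# A binary built against the 4K ABI may map a file at offset 4096.
offset = 4096
print(check_mmap_offset(offset, HARD_PAGE_SIZE))  # accepted with 4K pages
print(check_mmap_offset(offset, SOFT_PAGE_SIZE))  # rejected with the assumed soft page size
```

The same offset that an existing binary has always been allowed to use
stops being valid once the granularity grows, which is the breakage
described above.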

Since you have access to such a machine, can you please try to boot
2.4.21pre5aa2 on it? According to my math that must boot just fine too,
though there will be little normal zone left compared to your kernel
with pgcl. But 700M of normal zone is still nothing versus the 64G of
ram; what I mean is that even pgcl doesn't guarantee general purpose
usability of the box (despite making it much more usable).

Note that the 2.5 mem_map is bigger due to rmap; my 2.4 w/o the rmap
slowdown should boot just fine w/o pgcl, which breaks the linux x86 ABI
and in turn binary apps at runtime.
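
The math behind this can be sketched roughly as follows. It assumes a
flat mem_map with one struct page per page of RAM; the 40-byte struct
page and the 32K soft page size are guesses for illustration and are
not taken from any particular kernel tree.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def mem_map_bytes(ram_bytes, page_size, struct_page_bytes):
    """Size of a flat mem_map: one struct page per page of RAM."""
    return (ram_bytes // page_size) * struct_page_bytes


RAM = 64 * GIB
STRUCT_PAGE = 40  # assumed bytes per struct page, illustration only

# Compare the hardware page size with an assumed larger soft page size.
for page_size in (4 * 1024, 32 * 1024):
    size = mem_map_bytes(RAM, page_size, STRUCT_PAGE)
    print(f"{page_size // 1024}K pages: mem_map ~ {size // MIB} MiB")
```

Under these assumptions the mem_map costs 640 MiB with 4K pages and
80 MiB with 32K pages, all of it taken out of low memory. A bigger
struct page scales the cost linearly, which is why the extra rmap field
matters on a 64G box.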

I believe pgcl is very interesting for x86 long term, and for all archs
not providing a flexible hard-page-size. The larger softpagesize can
increase performance; a 4k page size is too small these days, especially
on a 64bit arch. In fact this softpagesize feature would be most
interesting for x86-64 even in the long term (for a totally different
reason than why you find it most interesting today on the 32bit x86 64G
boxes ;). So it sounds like a good thing to have for mainline >=2.5
regardless. All that matters is that you don't try to make the page
allocator return anything smaller than the softpagesize, to avoid
losing reliability of allocations for core data structures.

Andrea