Re: What to expect with the 2.6 VM

William Lee Irwin III (wli@holomorphy.com)
Thu, 3 Jul 2003 13:16:07 -0700


On Thu, Jul 03, 2003 at 11:53:41AM -0700, William Lee Irwin III wrote:
>> No, it is not true that pagetable space can be wantonly wasted
>> on 64-bit.
>> Try mmap()'ing something sufficiently huge and accessing on average
>> every PAGE_SIZE'th virtual page, in a single-threaded single process.
>> e.g. various indexing schemes might do this. This is 1 pagetable page
>> per page of data (worse if shared), which blows major goats.

On Thu, Jul 03, 2003 at 09:27:50PM +0200, Andrea Arcangeli wrote:
> that's the very old exploit that touches 1 page per pmd.

It's not an exploit. It's called "random access" or a "sparse mapping".
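
To put rough numbers on the sparse case, here is a minimal sketch of the
arithmetic. The geometry is an assumption, not something stated in this
thread: 4 KiB pages and 8-byte PTEs, so one leaf pagetable page holds 512
entries and covers 2 MiB of virtualspace. The SKETCH_* names and
leaf_pagetable_pages() are hypothetical, for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry (x86-64 style); adjust for other architectures. */
#define SKETCH_PAGE_SIZE     4096ULL
#define SKETCH_PTE_SIZE      8ULL
#define SKETCH_PTES_PER_PAGE (SKETCH_PAGE_SIZE / SKETCH_PTE_SIZE)      /* 512 */
#define SKETCH_PTE_PAGE_SPAN (SKETCH_PTES_PER_PAGE * SKETCH_PAGE_SIZE) /* 2 MiB */

/*
 * Leaf pagetable pages needed when npages data pages are touched at a
 * fixed stride, starting at an address aligned to SKETCH_PTE_PAGE_SPAN.
 * Upper pagetable levels are ignored.
 */
static uint64_t leaf_pagetable_pages(uint64_t npages, uint64_t stride_bytes)
{
    uint64_t last_offset, spanned;

    if (npages == 0)
        return 0;

    /* Count the 2 MiB regions between the first and last touched page. */
    last_offset = (npages - 1) * stride_bytes;
    spanned = last_offset / SKETCH_PTE_PAGE_SPAN + 1;

    /* With strides wider than one region, only touched regions count. */
    return spanned < npages ? spanned : npages;
}
```

Under these assumptions, leaf_pagetable_pages(1000, SKETCH_PTE_PAGE_SPAN)
comes to 1000, i.e. one pagetable page per page of data, while the dense
case leaf_pagetable_pages(1000, SKETCH_PAGE_SIZE) comes to 2.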

On Thu, Jul 03, 2003 at 11:53:41AM -0700, William Lee Irwin III wrote:
>> There's a reason why those things use inverted pagetables... at any
>> rate, compacting virtualspace with remap_file_pages() solves it too.
>> Large pages won't help, since the data isn't contiguous.

On Thu, Jul 03, 2003 at 09:27:50PM +0200, Andrea Arcangeli wrote:
> if you can't use a sane design it's not a kernel issue. this is bad
> userspace code seeking like crazy on disk too, working around it with a
> kernel feature sounds worthless. If algorithms have no locality at all,
> and they spread 1 page per pmd that's their problem.

Hashtables and B-trees are sane designs in my book.

The entire point of the exercise on 64-bit is to allow indexing
algorithms _some_ method of restoring virtual locality so as to
cooperate with the kernel and not throw away vast amounts of space
on pagetables.
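
As a rough illustration of what "restoring virtual locality" could look
like from userspace, here is a minimal sketch that maps a handful of
scattered file pages into one small contiguous window with
remap_file_pages(). The helper names map_compacted() and
demo_compacted_read(), the temporary file path and the page offsets are
all hypothetical. Note also that later kernels deprecated
remap_file_pages() and emulate it, so the pagetable savings depend on the
kernel in use.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map n scattered file pages (page indices in file_pgoff[]) into one
 * contiguous window. Returns the window base, or MAP_FAILED on error.
 */
static void *map_compacted(int fd, const size_t *file_pgoff, size_t n)
{
    long psz = sysconf(_SC_PAGESIZE);
    size_t page, i;
    char *win;

    if (psz <= 0 || n == 0)
        return MAP_FAILED;
    page = (size_t)psz;

    /* remap_file_pages() only works on MAP_SHARED mappings. */
    win = mmap(NULL, n * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (win == MAP_FAILED)
        return MAP_FAILED;

    for (i = 0; i < n; i++) {
        /* prot must be 0; the file offset is given in pages. */
        if (remap_file_pages(win + i * page, page, 0, file_pgoff[i], 0) != 0) {
            munmap(win, n * page);
            return MAP_FAILED;
        }
    }
    return win;
}

/* Returns 0 on success, 1 if setup or remap is unavailable, -1 on mismatch. */
static int demo_compacted_read(void)
{
    char path[] = "/tmp/sparse-demo-XXXXXX";   /* hypothetical scratch file */
    const size_t pgoff[3] = { 40, 3, 17 };     /* scattered file pages */
    const char tag[3] = { 'a', 'b', 'c' };
    long psz = sysconf(_SC_PAGESIZE);
    size_t page, i;
    char *win = MAP_FAILED;
    int rc = 1;
    int fd = mkstemp(path);

    if (fd < 0)
        return 1;
    unlink(path);
    if (psz <= 0)
        goto out;
    page = (size_t)psz;

    if (ftruncate(fd, (off_t)(64 * page)) != 0)
        goto out;
    for (i = 0; i < 3; i++)
        if (pwrite(fd, &tag[i], 1, (off_t)(pgoff[i] * page)) != 1)
            goto out;

    win = map_compacted(fd, pgoff, 3);
    if (win == MAP_FAILED)
        goto out;

    /* Window page i should now show file page pgoff[i]. */
    rc = 0;
    for (i = 0; i < 3; i++)
        if (win[i * page] != tag[i])
            rc = -1;
    munmap(win, 3 * page);
out:
    close(fd);
    return rc;
}
```

The three file pages sit far apart in the file, but after the remap they
occupy three adjacent virtual pages, so they can share a single pagetable
page instead of needing one each.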

Locality of reference in itself is typically already present in the
data access patterns.

On Thu, Jul 03, 2003 at 09:27:50PM +0200, Andrea Arcangeli wrote:
> the easiest way to waste ram with bad code is to add this in the first
> line of the main of a program:
> p = malloc(1G)
> bzero(p, 1G)
> you don't need 1 page per pmd to waste ram. Should we also write a
> kernel feature that checks if the page is zero and drops it so the above
> won't swap etc..?

I don't know what kind of moron you take me for but I don't care to be
patronized like that.

There is a very, very large difference between utter idiocy like the
above and sparse mappings. And there is a _much_ larger difference
between direct cooperation from userspace, which conserves resources
otherwise consumed by sparse mappings by compacting virtualspace with
remap_file_pages(), and bullshit like implicitly checking if user pages
are 0 before trying to swap them out.

Color me thoroughly disgusted.

On Thu, Jul 03, 2003 at 09:27:50PM +0200, Andrea Arcangeli wrote:
> If you can come up with a real life example where the 1 page per pmd
> scattered over 1T of address space (we're talking about the file here of
> course, the on disk representation of the data) is the very best design
> possible ever (without any concept of locality at all) and it speeds up
> things of orders of magnitude not to have any locality at all,
> especially for the huge seeking it will generate no matter what the
> pagetable side is, you will then change my mind about it.

It doesn't even need to be an artifact of a design like a hash table or
a B-tree. It can be totally linear and 1:1 but the inputs will mandate
random access.

This entire tangent is ridiculous; the entire counterargument centers
around some desire not to go through the straightforward steps to
support a preexisting feature on the pretext of some notion that
cooperating with the kernel so as not to throw away vast amounts of
space on pagetables and/or vma's is an invalid application behavior.

-- wli