Re: Note describing poor dcache utilization under high memory pressure

Linus Torvalds (torvalds@transmeta.com)
Tue, 29 Jan 2002 12:49:24 -0800 (PST)


On Tue, 29 Jan 2002, William Lee Irwin III wrote:
>
> In my mind it's not about the form but about how much of it is exposed.
> For instance, exposing the number of levels seems to require emulating
> an extra level for machines with 2-level pagetables.

Well, you have two choices:

 - _not_ exposing fundamental details like the number of levels causes
   different architectures to have wildly different code (see how UNIX
   traditionally does MM, and puke)

 - trivial "folding" macros to take three levels down to two (or four
   levels down to three or two).

Note that the folding macros really _are_ trivial. The pmd macros for x86
are basically these few lines:

	static inline int pgd_none(pgd_t pgd) { return 0; }
	static inline int pgd_bad(pgd_t pgd) { return 0; }
	static inline int pgd_present(pgd_t pgd) { return 1; }
	#define pgd_clear(xp) do { } while (0)

	static inline pmd_t * pmd_offset(pgd_t * dir, unsigned long address)
	{
		return (pmd_t *) dir;
	}

And that's it.

So I'd much rather have a generic VM and do some trivial folding.
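
To illustrate how the folding keeps the generic code unchanged, here is a minimal userspace sketch, not the kernel's actual source. It assumes an x86-style two-level layout (10 bits of pgd index, 10 bits of pte index, 12 bits of page offset). The type definitions, pgd_offset(), pmd_none(), pte_offset() and lookup_pte() are hypothetical stand-ins written for this example; only pgd_none(), pgd_bad() and pmd_offset() follow the lines quoted above. The walker is written as if there were three levels, and the middle step collapses to a pointer cast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 10-bit pgd index, 10-bit pte index, 12-bit offset. */
#define PAGE_SHIFT   12
#define PGDIR_SHIFT  22
#define PTRS_PER_PGD 1024
#define PTRS_PER_PTE 1024

/* Hypothetical userspace types; the pgd entry simply *is* the pmd entry. */
typedef struct { uintptr_t pte; } pte_t;
typedef struct { uintptr_t pmd; } pmd_t;   /* address of a pte table, or 0 */
typedef struct { pmd_t pmd; } pgd_t;

/* Folded level: a pgd entry is never empty or bad on its own. */
static inline int pgd_none(pgd_t pgd) { (void) pgd; return 0; }
static inline int pgd_bad(pgd_t pgd)  { (void) pgd; return 0; }

/* Folded level: the "pmd" is the pgd entry itself. */
static inline pmd_t *pmd_offset(pgd_t *dir, unsigned long address)
{
	(void) address;
	return (pmd_t *) dir;
}

/* Helpers below are illustrative stand-ins, not kernel code. */
static inline pgd_t *pgd_offset(pgd_t *base, unsigned long address)
{
	return base + ((address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

static inline int pmd_none(pmd_t pmd) { return pmd.pmd == 0; }

static inline pte_t *pte_offset(pmd_t *pmd, unsigned long address)
{
	pte_t *table = (pte_t *) pmd->pmd;
	return table + ((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1));
}

/* Generic three-level walk; returns NULL if nothing is mapped. */
pte_t *lookup_pte(pgd_t *base, unsigned long address)
{
	assert(base != NULL);

	pgd_t *pgd = pgd_offset(base, address);
	if (pgd_none(*pgd) || pgd_bad(*pgd))
		return NULL;

	pmd_t *pmd = pmd_offset(pgd, address);   /* no-op on two levels */
	if (pmd_none(*pmd))
		return NULL;

	return pte_offset(pmd, address);
}
```

On a real three-level architecture only the helper definitions would change; lookup_pte() itself would stay as written, which is the point of exposing the levels and folding them.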

> It's quite a happy coincidence when this happens, and in my mind making
> it happen more often would be quite nice.

It really isn't a coincidence. The reason so many architectures have page
table trees is that most architects try to make good decisions, and a tree
layout is a simple and efficient data structure that maps well to both
hardware and to usage patterns.

Hashed page tables are incredibly naive, and perform badly for build-up
and tear-down (and mostly have horrible cache access patterns). At least
in some version of the UltraSparc, the Linux tree-based software TLB fill
outperformed the Solaris version, even though the Solaris version was
handtuned assembly and used hardware acceleration for the hash
computations. That should tell you something.
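
The cache-access point can be illustrated with a small sketch. It compares where consecutive virtual pages land in a tree's leaf table and in a hashed table. The hash used here is a hypothetical multiplicative hash picked purely for illustration; it is not the hash of any real architecture, and HASH_BITS, tree_slot(), hash_slot() and adjacent_pairs() are names invented for this example.

```c
#include <stdint.h>

#define PAGE_SHIFT   12
#define PTRS_PER_PTE 1024u
#define HASH_BITS    10        /* hypothetical: 1024 hash buckets */

/* Tree leaf: the slot is just the low bits of the virtual page number. */
unsigned tree_slot(uint32_t vaddr)
{
	return (vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/* Hypothetical multiplicative hash of the virtual page number. */
unsigned hash_slot(uint32_t vaddr)
{
	uint32_t vpn = vaddr >> PAGE_SHIFT;
	return (uint32_t) (vpn * 2654435761u) >> (32 - HASH_BITS);
}

/* Count consecutive pages whose entries sit in adjacent slots. */
unsigned adjacent_pairs(unsigned (*slot)(uint32_t),
			uint32_t start, unsigned npages)
{
	unsigned count = 0;

	for (unsigned i = 0; i + 1 < npages; i++) {
		unsigned a = slot(start + (i << PAGE_SHIFT));
		unsigned b = slot(start + ((i + 1) << PAGE_SHIFT));
		if (b == a + 1)
			count++;
	}
	return count;
}
```

For a run of 16 pages starting at 0x08048000, every neighbouring pair is adjacent in the tree leaf, so building or tearing down the range walks memory sequentially. With this particular hash no pair is adjacent, so each page touches a different part of the table.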

Linus
