Re: x86 ptep_get_and_clear question

Linus Torvalds (torvalds@transmeta.com)
Fri, 16 Feb 2001 09:36:07 -0800 (PST)


On Fri, 16 Feb 2001, Jamie Lokier wrote:

> Manfred Spraul wrote:
> > Ok, Is there one case were your pragmatic solutions is vastly faster?
>
> > * mprotect: No. The difference is at most one additional locked
> > instruction for each pte.
>
> Oh, what instruction is that?

The "set_pte()" thing could easily be changed into

lock ; orl pte,(ptepointer)

which actually should work as-is. We do not allow "set_pte()" on anything
but "pte_none()" entries anyway, so in the trivial case the "orl" is
exactly equivalent to a "movl". And in the (so far theoretical) case where
another CPU might have set the dirty bit, the locked "or" will again do
the right thing, and preserve it.

So that would basically be a one-liner that removes the set_pte() race for
mprotect() (and the vmscan.c case of re-establishing the pte, but as
vmscan needs to do something more anyway that part is probably not
interesting).

> > * munmap(anon): No. We must handle delayed accessed anyway (don't call
> > free_pages_ok() until flush_tlb_ipi returned). The difference is that we
> > might have to perform a second pass to clear any spurious 0x40 bits.
>
> That second pass is what I had in mind.
>
> > * munmap(file): No. Second pass required for correct msync behaviour.
>
> It is?

Not now it isn't. We just do a msync() + fsync() for msync(MS_SYNC). Which
is admittedly not optimal, but it works.

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/