Re: [PATCH] low-latency zap_page_range()

Linus Torvalds (torvalds@transmeta.com)
Thu, 29 Aug 2002 22:37:02 +0000 (UTC)


In article <20020829213830.GG888@holomorphy.com>,
William Lee Irwin III <wli@holomorphy.com> wrote:
>Robert Love wrote:
>>> unless we
>>> wanted to unconditionally drop the locks and let preempt just do the
>>> right thing and also reduce SMP lock contention in the SMP case.
>
>On Thu, Aug 29, 2002 at 01:59:17PM -0700, Andrew Morton wrote:
>> That's an interesting point. page_table_lock is one of those locks
>> which is occasionally held for ages, and frequently held for a short
>> time.
>> I suspect that yes, voluntarily popping the lock during the long holdtimes
>> will allow other CPUs to get on with stuff, and will provide efficiency
>> increases. (It's a pretty lame way of doing that though).
>> But I don't recall seeing nasty page_table_lock spintimes on
>> anyone's lockmeter reports, so...
>
>You will. There are just bigger fish to fry at the moment.

You will NOT.

The page_table_lock protects against page stealing of the VM and
concurrent page-faults, nothing else. There is no way you can get
contention on it under any reasonable load that doesn't involve heavy
out-of-memory behaviour, simply because

- the lock is per-mm
- all "regular" paths that care about this also get the mmap semaphore

In short, that spinlock has _zero_ scalability impact. You can
theoretically get contention on it without memory pressure only by
having hundreds of threads page-faulting at the same time (getting a
read-lock on the mmap semaphore), but by then your performance has
nothing to do with the spinlock, and everything to do with the page
faults themselves.

(In fact, I can almost guarantee that most of the long hold-times are
for exit(), not for munmap(). And in that case the spinlock cannot get
any non-pagestealer contention at all, since nobody else is using the
MM)

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/