"absorbing" is a nice word for it. The way I see it, page_add_rmap and
page_remove_rmap are even more expensive than the pagetable zapping.
They're even more expensive than copy_page_range. Also look at the
numbers on the right: IMHO they're even more interesting for finding
what is worth optimizing away first.
> These things aren't cheap with or without rmap. Trimming down
Lots of things aren't cheap, but that isn't a good reason to make them
twice as expensive, especially if they were as cheap as possible and
they're critical hot paths.
> accounting overhead could raise search problems elsewhere.
This is the point indeed, but at least in 2.4 I don't see any cpu saving
advantage during swapping, because during swapping the cpu is always idle.
In fact I had to drop the lru_cache_add too from the anonymous page fault
path, because it was wasting way too much cpu to get peak performance. (Of
course you're using per-page spinlocks by hand with rmap, and
lru_cache_add needs a global spinlock, so at least rmap shouldn't
introduce a very big scalability issue, unlike lru_cache_add.)
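
To illustrate the locking difference mentioned above, here is a toy
userspace sketch, not kernel code. All lk_* names are made up for the
example, and the C11 atomic_flag busy-wait lock only stands in for a
kernel spinlock. The per-page path takes a lock that only CPUs touching
the same page would contend on, while the lru-style path funnels every
caller through one global lock.

```c
#include <assert.h>
#include <stdatomic.h>

#define LK_NPAGES 4   /* made-up size for the toy */

struct lk_page {
    atomic_flag lock;   /* per-page lock (stand-in for rmap's per-page spinlock) */
    int nr_mappings;    /* stand-in for the per-page reverse-map chain */
};

static struct lk_page lk_pages[LK_NPAGES];
static atomic_flag lk_lru_lock = ATOMIC_FLAG_INIT;  /* the single global lock */
static int lk_lru_len;

static void lk_spin_lock(atomic_flag *l)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;
}

static void lk_spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

void lk_init(void)
{
    for (int i = 0; i < LK_NPAGES; i++) {
        atomic_flag_clear(&lk_pages[i].lock);
        lk_pages[i].nr_mappings = 0;
    }
    atomic_flag_clear(&lk_lru_lock);
    lk_lru_len = 0;
}

/* rmap-style update: serializes only against users of the same page. */
void lk_page_add_mapping(int page)
{
    assert(page >= 0 && page < LK_NPAGES);
    lk_spin_lock(&lk_pages[page].lock);
    lk_pages[page].nr_mappings++;
    lk_spin_unlock(&lk_pages[page].lock);
}

/* lru-style update: every caller serializes on the same global lock. */
void lk_lru_add(void)
{
    lk_spin_lock(&lk_lru_lock);
    lk_lru_len++;
    lk_spin_unlock(&lk_lru_lock);
}

int lk_page_mappings(int page)
{
    assert(page >= 0 && page < LK_NPAGES);
    return lk_pages[page].nr_mappings;
}

int lk_lru_length(void)
{
    return lk_lru_len;
}
```

Both paths still pay for an atomic operation per call, which is the
overhead being argued about. The sketch only shows why the per-page
variant should scale better across CPUs than the global one.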
> Whether avoiding the search problem is worth the accounting overhead
> could probably use some more investigation, like actually trying the
> anonymous page handling rework needed to use vma-based ptov resolution.
The only solution is to do rmap lazily, i.e. to start building the rmap
during swapping by walking the pagetables, basically exactly like in my
2.4 tree, where I recently started refilling the lru with anonymous pages
only once I need that information. That way, if you never need to heavily
page out several gigabytes of ram (like most very high end numa servers),
you'll never waste a single cycle on locking or any other worthless
accounting overhead that hurts the performance of all common workloads.
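
A minimal userspace sketch of the lazy idea above, under loud
assumptions: this is a toy model, not kernel code, and every toy_* name
and size is invented for illustration. The fault path only writes the
pte. The reverse map gets built by walking all the ptes the first time
the swap path asks who maps a page. Throwing away the whole reverse map
on every map/unmap is a simplification to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NPAGES  8     /* physical pages in the toy machine (made up) */
#define TOY_NPTES   16    /* pte slots across all toy address spaces */
#define TOY_NO_PAGE (-1)

static int toy_pte[TOY_NPTES];               /* slot -> physical page */
static int toy_rmap[TOY_NPAGES][TOY_NPTES];  /* page -> slots mapping it */
static int toy_rmap_count[TOY_NPAGES];
static bool toy_rmap_valid;                  /* false until swapping needs it */
static unsigned long toy_rmap_builds;        /* how many times we paid the walk */

void toy_init(void)
{
    for (int i = 0; i < TOY_NPTES; i++)
        toy_pte[i] = TOY_NO_PAGE;
    for (int p = 0; p < TOY_NPAGES; p++)
        toy_rmap_count[p] = 0;
    toy_rmap_valid = false;
    toy_rmap_builds = 0;
}

/* Fault path: write the pte, no per-page chain update and no locking. */
void toy_map(int slot, int page)
{
    assert(slot >= 0 && slot < TOY_NPTES);
    assert(page >= 0 && page < TOY_NPAGES);
    toy_pte[slot] = page;
    toy_rmap_valid = false;   /* simplification: any built rmap is now stale */
}

void toy_unmap(int slot)
{
    assert(slot >= 0 && slot < TOY_NPTES);
    toy_pte[slot] = TOY_NO_PAGE;
    toy_rmap_valid = false;
}

/* Swap path only: rebuild the reverse map by walking every pte. */
static void toy_build_rmap(void)
{
    for (int p = 0; p < TOY_NPAGES; p++)
        toy_rmap_count[p] = 0;
    for (int i = 0; i < TOY_NPTES; i++) {
        int page = toy_pte[i];
        if (page != TOY_NO_PAGE)
            toy_rmap[page][toy_rmap_count[page]++] = i;
    }
    toy_rmap_valid = true;
    toy_rmap_builds++;
}

/* Returns how many slots map 'page'; *slots points at the slot list. */
int toy_page_mappers(int page, const int **slots)
{
    assert(page >= 0 && page < TOY_NPAGES);
    if (!toy_rmap_valid)
        toy_build_rmap();     /* cost is paid here, never in toy_map() */
    *slots = toy_rmap[page];
    return toy_rmap_count[page];
}

unsigned long toy_rmap_build_count(void)
{
    return toy_rmap_builds;
}
```

In this model a workload that only calls toy_map() and toy_unmap() never
triggers a walk at all. The first toy_page_mappers() call pays for one
full walk, and later lookups are free until the ptes change again.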