Creating a per-task kernel space for kmap, user pagetables, et al

Martin J. Bligh (Martin.Bligh@us.ibm.com)
Wed, 20 Mar 2002 11:09:05 -0800


Please forgive the following for deliberate simplifiations and accidental misunderstandings.
If you flame gently, *maybe* we can pull something useful out of the ashen remains ;-)

This is an evolution of my original plan to do per-cpu persisent kmap pools a
couple of months ago - thanks to Rik for pointing out this was not going to work.

OK, so currently we divide the virtual address space into user space (u-space) and kernel
space (k-space) at the PAGE_OFFSET boundary, with two fundamental differences that I'm
interested in for the sake of this argument.

1. u-space has different protections from k-space (ie user can't read/write it directly)

2. u-space is per task. k-space is common across all tasks.

Imagine we create a hybrid "u-k-space" with the protections of k-space, but the locality
of u-space .... either by making part of the current k-space per task or by making part of
the current u-space protected like k-space ... not sure which would be easier.

This u-k-space would be a good area for at least two things (and probably others):

1. A good place to put the process pagetables. We only use up the amount of virtual
address space (vaddr space) for one task's pagetables - if we map them into ZONE_NORMAL
(as current mainline) we use up vaddr space for *all* task's pagetables - if we map them
through kmap (atomic or persistent), we pay dearly in tlbflushes.

2. A good place to make a per-task kmap area. This would be on a pool system similar to
the current persistent kmap. We would potentially do only a local cpu tlb_flush_all when
this table ran out (though if we're clever, we can use the context switches tlb_flush to
do this for us). This would make copy_to_user stuff that's currently done under kmap
cheaper.

This, unfortunately, isn't a total solution - we may sometimes need to modify the task's
pagetables from outside the process context, eg. swapout (thanks to dmc for pointing
this out to me ;-)). For this, we'd just use the existing kmap mechanism to create another
mapping to use temporarily, and we're no worse off than before. But on the whole I think
it wins us enough to be worthwhile.

Opinions?

Martin.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/