Re: Better CLONE_SETTLS support for Hammer

Andi Kleen (ak@suse.de)
Thu, 6 Mar 2003 11:27:20 +0100


On Wed, Mar 05, 2003 at 08:14:18PM -0800, Ulrich Drepper wrote:
>
> > It should already work on the current kernel, modulo clone.
> > (but arch_prctl, set_thread_area in 2.5, ldt in 2.4 etc.)
>
> I cannot confirm this. I wasted a lot of time trying to get it to work,
> to no avail.

We're using a %fs-enabled glibc (currently with arch_prctl, but I would
like to change that because it's slow).

> And the problem is? Nobody should muck around with the segment registers
> without knowing what s/he is doing.

For one thing, it hardcodes the magic fs register in the kernel.

> You don't need two interfaces. Make prctl() do it automatically. It has
> all the info it needs. Forget about the set_thread_area syscall in
> 64-bit mode and simply use one fixed GDT entry in case the address
> passed to prctl() is small enough. Same for clone(): the SETTLS
> parameter should be a simple address. Treat it as passed to prctl() and
> use a segment or the MSR.

I had some code like this for some time - not in prctl, but in set_thread_area,
but I removed it because the selector messing looked too ugly.

But that was before prctl reloaded the selector forcefully to zero.
That was later changed to fix another bug.

Now with it getting reloaded it would make sense to set it in the GDT
too if possible, I agree. I'll implement that.
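
A minimal sketch of the selection logic discussed above, not the actual
kernel code. It assumes only that a GDT segment descriptor can hold a
32-bit base, so an address below 4GB can be reached through a fixed GDT
entry, while anything larger needs the 64-bit base MSR. The names
tls_base_method and choose_tls_base_method are hypothetical.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not the kernel's identifiers. */
enum tls_base_method {
    TLS_BASE_VIA_GDT,   /* base fits in 32 bits: use a fixed GDT entry */
    TLS_BASE_VIA_MSR    /* base needs 64 bits: write the base MSR instead */
};

/*
 * Decide how a thread-pointer address could be installed.
 * A segment descriptor carries a 32-bit base, so only addresses that
 * fit in 32 bits can go through the (faster) segment-load path.
 */
static enum tls_base_method choose_tls_base_method(uint64_t addr)
{
    return addr <= UINT32_MAX ? TLS_BASE_VIA_GDT : TLS_BASE_VIA_MSR;
}
```

Under this sketch the same helper would serve both arch_prctl and the
CLONE_SETTLS path in clone, since both receive a plain address, e.g.
choose_tls_base_method(0x100000000ULL) selects the MSR path.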

It will also transparently speed up the glibcs already using arch_prctl.
I like that.

I can do a similar thing in clone. It unfortunately also hardcodes fs there,
but I guess that ugly hack will be needed to get the broken NPTL design for this
to work.

You just have to guarantee from user space that you don't do nasty
things with the selector.

>
> - have prctl() return the index and expect the user to load it. This is
> slightly binary incompatible (existing code depends on no such
> requirement). It could be solved by introducing ARCH_SET_FS_AUTO or
> so;
>
> - automatically load the %fs or %gs register with the correct value
> before returning from prctl(). This introduces no binary
> incompatibilities and it's really the expected behavior.

It's already done (set to zero) to not confuse the lazy switch logic.

>
>
> If you don't want to do the work, help me to get 2.5 running on my
> machine and I'll come up with a patch.

2.5.64 currently doesn't boot (known issue); 2.5.63 works, however.
I'll look into the .64 problems later today and put a fix in the
usual place when done (ftp://ftp.x86-64.org/pub/linux/v2.5/)

-Andi