> On 7 Mar 2002 12:17:52 -0800
> "H. Peter Anvin" <hpa@zytor.com> wrote:
> 
> 
>>Followup to:  <20020307191043.9C5F33FE15@smtp.linux.ibm.com>
>>By author:    Hubertus Franke <frankeh@watson.ibm.com>
>>In newsgroup: linux.dev.kernel
>>
>>>Take a look at Rusty's futex-1.2, the code is not that different, however
>>>if its all inlined it creates additional code on the critical path 
>>>and why do it if not necessary.
>>>
>>>In this case the futexes are the well tested path, the rest is a cludge on
>>>top of it.
>>>
>>>
>>Perhaps someone could give a high-level description of how these
>>"futexes" work?
>>
> 
> Certainly!  This is how Futexes IV (and Futexes V, ignoring the fairness
> stuff that adds) works:
> 
> One or more userspace processes share address space, so they can both do
> simple atomic operations on the same memory (hence the new PROT_SEM flag to
> mmap/mprotect for architectures which need to know).  They agree that "this
> address contains an integer which we use as a mutex" (aka. struct futex).
> 
> The futex starts at 1 (available).  down() looks like:
> 	if (atomic_dec_and_test(futex)) return 0; /* We got it! */
> 	else sys_futex(futex, DOWN); /* Go to kernel, wait. */
> 
> up() looks like:
> 	if (atomic_inc_positive(futex)) return 0; /* Noone waiting */
> 	else sys_futex(futex, UP); /* go to kernel, wake people */
> 
> Inside the kernel, we do what you'd expect.  For sys_futex(futex, DOWN):
> 	Pin the page containing the futex, for convenience.
> 	Hash the kernel address of the futex to get a waitqueue.
> 	Add ourselves to the waitqueue.
> 	While !atomic_dec_and_test() (ie. futex still unavailable):
> 		sleep
> 		if signal pending, break;
> 	Unhash from waitqueue
> 	unpin page.
> 	return success or -EINTR.
> 
> For sys_futex(futex, UP):
> 	Pin page for convenience.
> 	Hash kernel address of futex to get the waitqueue.
> 	set futex to 1 (ie. available).
> 	Wake up the first one on the waitqueue waiting for this futex.
> 	unpin page
> 
> The only two twists to add are that we don't keep atomic_dec_and_test'ing
> in the loop forever (if it's already negative we don't bother), so counter
> doesn't wrap, and we don't use actual waitqueues because we share them, but
> we want to know which futex each waiter is waiting on.
> 
Okay, dumb question...
What can this do that shared memory + existing semaphores can't do?
	-hpa
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/