Message-ID: <1386696996.2731.18.camel@buesod1.americas.hpqcorp.net>
Date:	Tue, 10 Dec 2013 09:36:36 -0800
From:	Davidlohr Bueso <davidlohr@...com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, mingo@...nel.org,
	dvhart@...ux.intel.com, tglx@...utronix.de,
	paulmck@...ux.vnet.ibm.com, efault@....de, jeffm@...e.com,
	torvalds@...ux-foundation.org, scott.norton@...com,
	tom.vaden@...com, aswin@...com, Waiman.Long@...com,
	jason.low2@...com
Subject: Re: [PATCH v2 4/4] futex: Avoid taking hb lock if nothing to wakeup

On Tue, 2013-12-10 at 18:15 +0100, Peter Zijlstra wrote:
> ---
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -82,12 +82,13 @@
>   * The waker side modifies the user space value of the futex and calls
>   * futex_wake(). It computes the hash bucket and acquires the hash
>   * bucket lock. Then it looks for waiters on that futex in the hash
> - * bucket and wakes them. In scenarios where wakeups are called and no
> - * tasks are blocked on a futex, taking the hb spinlock can be avoided
> - * and simply return. In order for this optimization to work, ordering
> - * guarantees must exist so that the waiter being added to the list is
> - * acknowledged when the list is concurrently being checked by the waker,
> - * avoiding scenarios like the following:
> + * bucket and wakes them.
> + *
> + * In scenarios where wakeups are called and no tasks are blocked on a futex,
> + * taking the hb spinlock can be avoided and simply return. In order for this
> + * optimization to work, ordering guarantees must exist so that the waiter
> + * being added to the list is acknowledged when the list is concurrently being
> + * checked by the waker, avoiding scenarios like the following:
>   *
>   * CPU 0                               CPU 1
>   * val = *futex;
> @@ -108,6 +109,7 @@
>   * This would cause the waiter on CPU 0 to wait forever because it
>   * missed the transition of the user space value from val to newval
>   * and the waker did not find the waiter in the hash bucket queue.
> + *
>   * The correct serialization ensures that a waiter either observes
>   * the changed user space value before blocking or is woken by a
>   * concurrent waker:
> @@ -117,7 +119,8 @@
>   * sys_futex(WAIT, futex, val);
>   *   futex_wait(futex, val);
>   *
> - *   mb(); <-- paired with ------
> + *   waiters++;
> + *   mb(); (A) <-- paired with -.
>   *                              |
>   *   lock(hash_bucket(futex));  |
>   *                              |
> @@ -126,22 +129,29 @@
>   *                              |        sys_futex(WAKE, futex);
>   *                              |          futex_wake(futex);
>   *                              |
> - *                              -------->   mb();
> + *                              `------->   mb(); (B)
>   *   if (uval == val)
>   *     queue();
>   *     unlock(hash_bucket(futex));
> - *     schedule();                         if (!queue_empty())
> + *     schedule();                         if (waiters)
>   *                                           lock(hash_bucket(futex));
>   *                                           wake_waiters(futex);
>   *                                           unlock(hash_bucket(futex));
>   *
> - * The length of the list is tracked with atomic ops (hb->waiters),
> - * providing the necessary memory barriers for the waiters. For the
> - * waker side, however, we rely on get_futex_key_refs(), using either
> - * ihold() or the atomic_inc(), for shared futexes. The former provides
> - * a full mb on all architectures. For architectures that do not have an
> - * implicit barrier in atomic_inc/dec, we explicitly add it - please
> - * refer to futex_get_mm() and hb_waiters_inc/dec().

IMHO this text gives a nice summary, as opposed to documenting each
function with things like '... implies MB (B)'. Anyway, I'll resend this
patch with your corrections.
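
For illustration only, the (A)/(B) pairing in the comment above can be
modeled in user space with C11 atomics. This is a rough sketch and not the
kernel code: struct bucket, waiter_announce(), waiter_done() and
waker_must_lock() are hypothetical names, and the seq_cst fences merely
stand in for the kernel's mb(). The point is that the waiter increments the
counter before it re-reads the futex word, and the waker updates the futex
word before it reads the counter, so at least one side sees the other.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a futex hash bucket; not the kernel's struct. */
struct bucket {
	atomic_int waiters;	/* tasks queued or about to queue */
};

/* Waiter side: announce ourselves before taking the bucket lock. */
static void waiter_announce(struct bucket *hb)
{
	/* waiters++ */
	atomic_fetch_add_explicit(&hb->waiters, 1, memory_order_relaxed);
	/* mb(); (A) - orders the increment before the later futex read */
	atomic_thread_fence(memory_order_seq_cst);
	/* ... lock the bucket, re-read the futex word, queue or bail ... */
}

/* Waiter side: drop the count once dequeued or if the value changed. */
static void waiter_done(struct bucket *hb)
{
	atomic_fetch_sub_explicit(&hb->waiters, 1, memory_order_relaxed);
}

/* Waker side: call only after the futex word has been updated. */
static bool waker_must_lock(struct bucket *hb)
{
	/* mb(); (B) - orders the futex store before the counter read */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&hb->waiters, memory_order_relaxed) != 0;
}
```

When waker_must_lock() returns false the waker can skip the bucket lock
entirely; otherwise it takes the lock and walks the queue as before.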

Thanks,
Davidlohr

