Message-ID: <20200827190804.GA128237@debian-boqun.qqnc3lrjykvubdpftowmye0fmh.lx.internal.cloudapp.net>
Date:   Fri, 28 Aug 2020 03:08:04 +0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, mhiramat@...nel.org,
        Eddy_Wu@...ndmicro.com, x86@...nel.org, davem@...emloft.net,
        rostedt@...dmis.org, naveen.n.rao@...ux.ibm.com,
        anil.s.keshavamurthy@...el.com, linux-arch@...r.kernel.org,
        cameron@...dycamel.com, oleg@...hat.com, will@...nel.org,
        paulmck@...nel.org
Subject: Re: [RFC][PATCH 6/7] freelist: Lock less freelist

On Thu, Aug 27, 2020 at 06:12:43PM +0200, Peter Zijlstra wrote:
> 
> 
> Cc: cameron@...dycamel.com
> Cc: oleg@...hat.com
> Cc: will@...nel.org
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
>  include/linux/freelist.h |  129 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 129 insertions(+)
> 
> --- /dev/null
> +++ b/include/linux/freelist.h
> @@ -0,0 +1,129 @@
> +// SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
> +#ifndef FREELIST_H
> +#define FREELIST_H
> +
> +#include <linux/atomic.h>
> +
> +/*
> + * Copyright: cameron@...dycamel.com
> + *
> + * A simple CAS-based lock-free free list. Not the fastest thing in the world
> + * under heavy contention, but simple and correct (assuming nodes are never
> + * freed until after the free list is destroyed), and fairly speedy under low
> + * contention.
> + *
> + * Adapted from: https://moodycamel.com/blog/2014/solving-the-aba-problem-for-lock-free-free-lists
> + */
> +
> +struct freelist_node {
> +	atomic_t		refs;
> +	struct freelist_node	*next;
> +};
> +
> +struct freelist_head {
> +	struct freelist_node	*head;
> +};
> +
> +#define REFS_ON_FREELIST 0x80000000
> +#define REFS_MASK	 0x7FFFFFFF
> +
> +static inline void __freelist_add(struct freelist_node *node, struct freelist_head *list)
> +{
> +	/*
> +	 * Since the refcount is zero, and nobody can increase it once it's
> +	 * zero (except us, and we run only one copy of this method per node at
> +	 * a time, i.e. the single thread case), then we know we can safely
> +	 * change the next pointer of the node; however, once the refcount is
> +	 * back above zero, then other threads could increase it (happens under
> +	 * heavy contention, when the refcount goes to zero in between a load
> +	 * and a refcount increment of a node in try_get, then back up to
> +	 * something non-zero, then the refcount increment is done by the other
> +	 * thread) -- so if the CAS to add the node to the actual list fails,
> +	 * decrease the refcount and leave the add operation to the next thread
> +	 * who puts the refcount back to zero (which could be us, hence the
> +	 * loop).
> +	 */
> +	struct freelist_node *head = READ_ONCE(list->head);
> +
> +	for (;;) {
> +		WRITE_ONCE(node->next, head);
> +		atomic_set_release(&node->refs, 1);
> +
> +		if (!try_cmpxchg_release(&list->head, &head, node)) {
> +			/*
> +			 * Hmm, the add failed, but we can only try again when
> +			 * the refcount goes back to zero.
> +			 */
> +			if (atomic_fetch_add_release(REFS_ON_FREELIST - 1, &node->refs) == 1)
> +				continue;
> +		}
> +		return;
> +	}
> +}
> +
> +static inline void freelist_add(struct freelist_node *node, struct freelist_head *list)
> +{
> +	/*
> +	 * We know that the should-be-on-freelist bit is 0 at this point, so
> +	 * it's safe to set it using a fetch_add.
> +	 */
> +	if (!atomic_fetch_add_release(REFS_ON_FREELIST, &node->refs)) {
> +		/*
> +		 * Oh look! We were the last ones referencing this node, and we
> +		 * know we want to add it to the free list, so let's do it!
> +		 */
> +		__freelist_add(node, list);
> +	}
> +}
> +
> +static inline struct freelist_node *freelist_try_get(struct freelist_head *list)
> +{
> +	struct freelist_node *prev, *next, *head = smp_load_acquire(&list->head);
> +	unsigned int refs;
> +
> +	while (head) {
> +		prev = head;
> +		refs = atomic_read(&head->refs);
> +		if ((refs & REFS_MASK) == 0 ||
> +		    !atomic_try_cmpxchg_acquire(&head->refs, &refs, refs+1)) {
> +			head = smp_load_acquire(&list->head);
> +			continue;
> +		}
> +
> +		/*
> +		 * Good, reference count has been incremented (it wasn't at
> +		 * zero), which means we can read the next and not worry about
> +		 * it changing between now and the time we do the CAS.
> +		 */
> +		next = READ_ONCE(head->next);
> +		if (try_cmpxchg_acquire(&list->head, &head, next)) {

So if try_cmpxchg_acquire() fails, we don't have ACQUIRE semantics on
the read of the new list->head, right? In that case, an
smp_mb__after_atomic() is probably needed on the failure path?
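
To make the concern concrete, here is a minimal userspace C11 sketch. It is
an analogy only, not kernel code: `struct node`, `list_head` and `try_pop()`
are hypothetical names, and the relaxed failure ordering is an assumption
standing in for a failed cmpxchg that provides no ordering. On failure, the
compare-exchange writes the newly observed head into `*expected`, and an
explicit acquire fence is issued before the caller goes on to dereference it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type, standing in for struct freelist_node. */
struct node {
	int payload;
	struct node *next;
};

/* Hypothetical list head, standing in for freelist_head.head. */
static _Atomic(struct node *) list_head;

/*
 * Try to pop the node the caller believes is at the head.
 * Returns the popped node on success, NULL on failure. On failure,
 * *expected is updated to the head value that was actually observed.
 */
static struct node *try_pop(struct node **expected)
{
	struct node *next;

	assert(*expected != NULL);
	next = (*expected)->next;

	/* Acquire ordering on success, relaxed (no ordering) on failure. */
	if (atomic_compare_exchange_strong_explicit(&list_head, expected, next,
						    memory_order_acquire,
						    memory_order_relaxed))
		return *expected;

	/*
	 * The failed compare-exchange loaded the new head without acquire
	 * semantics. Upgrade that load with a fence before the caller
	 * dereferences *expected.
	 */
	atomic_thread_fence(memory_order_acquire);
	return NULL;
}
```

In C11 the same effect could also be had by passing memory_order_acquire as
the failure ordering; the fence variant is shown because it mirrors the
barrier-after-failure shape being asked about.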

Regards,
Boqun

> +			/*
> +			 * Yay, got the node. This means it was on the list,
> +			 * which means should-be-on-freelist must be false no
> +			 * matter the refcount (because nobody else knows it's
> +			 * been taken off yet, it can't have been put back on).
> +			 */
> +			WARN_ON_ONCE(atomic_read(&head->refs) & REFS_ON_FREELIST);
> +
> +			/*
> +			 * Decrease refcount twice, once for our ref, and once
> +			 * for the list's ref.
> +			 */
> +			atomic_fetch_add(-2, &head->refs);
> +
> +			return head;
> +		}
> +
> +		/*
> +		 * OK, the head must have changed on us, but we still need to decrement
> +		 * the refcount we increased.
> +		 */
> +		refs = atomic_fetch_add(-1, &prev->refs);
> +		if (refs == REFS_ON_FREELIST + 1)
> +			__freelist_add(prev, list);
> +	}
> +
> +	return NULL;
> +}
> +
> +#endif /* FREELIST_H */
> 
> 
