Message-ID: <43ad1f76-3682-47f4-b3c9-62a94053db3c@efficios.com>
Date: Tue, 4 Nov 2025 15:17:17 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: linux-kernel@...r.kernel.org, lttng-dev@...ts.lttng.org
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
Alan Stern <stern@...land.harvard.edu>,
Lai Jiangshan <jiangshanlai@...il.com>
Subject: [RELEASE] Userspace RCU 0.15.4
Hi,
This is a patchlevel release of the Userspace RCU library.
The most relevant change in this release is the removal of
a redundant memory barrier on x86 for store and RMW operations
with the CMM_SEQ_CST_FENCE memory ordering. This addresses
a performance regression for users of the pre-0.15 uatomic API
that build against a liburcu configured to use compiler builtins
for atomics (--enable-compiler-atomic-builtins).
As a reminder, the CMM_SEQ_CST_FENCE MO is a superset of SEQ_CST:
it provides sequential consistency _and_ acts as a full memory
barrier, similar to the semantics associated with cmpxchg() and
atomic_add_return() within the LKMM.
Here is the rationale for this change:
/*
 * On x86, an atomic store with sequential consistency is always implemented
 * with an exchange operation, which has an implicit lock prefix when a memory
 * operand is used.
 *
 * Indeed, on x86, only loads can be re-ordered with prior stores. Therefore,
 * to preserve sequential consistency, either load operations or store
 * operations must carry a memory barrier. All major toolchains have chosen
 * to place this barrier on store operations, to avoid penalizing load
 * operations.
 *
 * Therefore, assuming that the toolchain in use follows this convention, it
 * is safe to rely on this implicit memory barrier to implement the
 * `CMM_SEQ_CST_FENCE` memory order, and thus no further barrier needs to be
 * emitted.
 */
*/
#define cmm_seq_cst_fence_after_atomic_store(...) \
do { } while (0)
/*
 * Keep the default implementation (emit a memory barrier) after load
 * operations for `CMM_SEQ_CST_FENCE`. The rationale is explained above for
 * `cmm_seq_cst_fence_after_atomic_store()`.
 */
/* #define cmm_seq_cst_fence_after_atomic_load(...) */
/*
 * On x86, atomic read-modify-write operations always carry a lock prefix,
 * either implicitly or explicitly, which provides sequential consistency.
 *
 * Therefore, no further memory barrier needs to be emitted for these
 * operations under the `CMM_SEQ_CST_FENCE` memory order.
 */
#define cmm_seq_cst_fence_after_atomic_rmw(...) \
do { } while (0)
Changelog:
2025-11-04 Userspace RCU 0.15.4
* uatomic: Fix redundant memory barriers for atomic builtin operations
* Cleanup: Remove useless declarations from urcu-qsbr
* src/urcu-bp.c: assert => urcu_posix_assert
* ppc.h: improve ppc64 caa_get_cycles on Darwin
Thanks,
Mathieu
Project website: https://liburcu.org
Git repository: https://git.liburcu.org/userspace-rcu.git
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com