Message-ID: <43ad1f76-3682-47f4-b3c9-62a94053db3c@efficios.com>
Date: Tue, 4 Nov 2025 15:17:17 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: linux-kernel@...r.kernel.org, lttng-dev@...ts.lttng.org
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
 Alan Stern <stern@...land.harvard.edu>,
 Lai Jiangshan <jiangshanlai@...il.com>
Subject: [RELEASE] Userspace RCU 0.15.4

Hi,

This is a patchlevel release of the Userspace RCU library.

The most relevant change in this release is the removal of
a redundant memory barrier on x86 for store and RMW operations
with the CMM_SEQ_CST_FENCE memory ordering. This addresses
a performance regression for users of the pre-0.15 uatomic API
that build against a liburcu configured to use compiler builtins
for atomics (--enable-compiler-atomic-builtins).

As a reminder, the CMM_SEQ_CST_FENCE MO is a superset of SEQ_CST:
it provides sequential consistency _and_ acts as a full memory
barrier, similar to the semantics associated with cmpxchg() and
atomic_add_return() within the LKMM.
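
In terms of the compiler atomic builtins, a CMM_SEQ_CST_FENCE
operation can be thought of as the matching SEQ_CST operation
followed by a full fence. A conceptual sketch (hypothetical helper,
not the actual liburcu implementation):

/*
 * Sketch only: a CMM_SEQ_CST_FENCE store behaves as a SEQ_CST
 * store followed by a full memory barrier.
 */
static inline void store_seq_cst_fence(unsigned long *addr, unsigned long v)
{
        __atomic_store_n(addr, v, __ATOMIC_SEQ_CST);
        __atomic_thread_fence(__ATOMIC_SEQ_CST);  /* full barrier */
}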

Here is the rationale for this change:

/*
 * On x86, an atomic store with sequential consistency is always implemented
 * with an exchange operation, which has an implicit lock prefix when a memory
 * operand is used.
 *
 * Indeed, on x86, only loads can be reordered with prior stores. Therefore,
 * to preserve sequential consistency, either load operations or store
 * operations need to carry a memory barrier. All major toolchains have chosen
 * to place this barrier on store operations, to avoid penalizing load
 * operations.
 *
 * Therefore, assuming that the toolchain in use follows this convention, it is
 * safe to rely on this implicit memory barrier to implement the
 * `CMM_SEQ_CST_FENCE` memory order, and thus no further barrier needs to be
 * emitted.
 */
#define cmm_seq_cst_fence_after_atomic_store(...)       \
         do { } while (0)
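
To see why the no-op is safe, consider what the toolchain emits for
a sequentially consistent store on x86-64. An illustration (exact
code generation depends on the compiler version):

unsigned long x;

void store_seq_cst(unsigned long v)
{
        /*
         * GCC and Clang emit an exchange with a memory operand:
         *     xchg %rdi, x(%rip)
         * xchg with a memory operand implies the lock prefix, which
         * acts as a full barrier, so a trailing mfence would be
         * redundant.
         */
        __atomic_store_n(&x, v, __ATOMIC_SEQ_CST);
}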

/*
 * Keep the default implementation (emit a memory barrier) after load
 * operations for the `CMM_SEQ_CST_FENCE` memory order. The rationale is
 * explained above for `cmm_seq_cst_fence_after_atomic_store()`.
 */
/* #define cmm_seq_cst_fence_after_atomic_load(...) */
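
Loads are the opposite case: a sequentially consistent load on x86
is a plain mov, so the full-barrier property of CMM_SEQ_CST_FENCE
still requires an explicit fence after the load. A sketch of the
default path (assumed shape, not the actual liburcu code):

static inline unsigned long load_seq_cst_fence(unsigned long *addr)
{
        /* Plain mov on x86: sequentially consistent, but not a
         * full barrier by itself. */
        unsigned long v = __atomic_load_n(addr, __ATOMIC_SEQ_CST);
        /* Emit the full barrier (mfence on x86). */
        __atomic_thread_fence(__ATOMIC_SEQ_CST);
        return v;
}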


/*
 * On x86, atomic read-modify-write operations always have a lock prefix,
 * either implicit or explicit, providing sequential consistency.
 *
 * Therefore, no further memory barrier needs to be emitted for these
 * operations for the `CMM_SEQ_CST_FENCE` memory order.
 */
#define cmm_seq_cst_fence_after_atomic_rmw(...) \
         do { } while (0)
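
For example (illustrative, compiler-dependent), a sequentially
consistent fetch-and-add already carries an explicit lock prefix on
x86-64:

unsigned long counter;

unsigned long inc_counter(void)
{
        /*
         * Compiles to:
         *     lock xadd %rax, counter(%rip)
         * The lock prefix is a full barrier, so no extra fence is
         * needed for CMM_SEQ_CST_FENCE.
         */
        return __atomic_fetch_add(&counter, 1, __ATOMIC_SEQ_CST);
}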

Changelog:

2025-11-04 Userspace RCU 0.15.4
         * uatomic: Fix redundant memory barriers for atomic builtin operations
         * Cleanup: Remove useless declarations from urcu-qsbr
         * src/urcu-bp.c: assert => urcu_posix_assert
         * ppc.h: improve ppc64 caa_get_cycles on Darwin

Thanks,

Mathieu

Project website: https://liburcu.org
Git repository: https://git.liburcu.org/userspace-rcu.git

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

