Message-ID: <CAO_YeogikhpZjg4Nhcdd0AKjRFCtZ4ohvVN5Y9DZgqmNiP8FRg@mail.gmail.com>
Date: Thu, 26 Jun 2025 19:30:23 +0100
From: Dylan <dyudaken@...il.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: paulmck@...nel.org, mingo@...hat.com, peterz@...radead.org, 
	juri.lelli@...hat.com, vincent.guittot@...aro.org, dietmar.eggemann@....com, 
	rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com, 
	shuah@...nel.org, linux-kernel@...r.kernel.org, 
	linux-kselftest@...r.kernel.org
Subject: Re: [PATCH 1/2] membarrier: allow cpu_id to be set on more commands

On Thu, Jun 26, 2025 at 5:07 PM Mathieu Desnoyers
<mathieu.desnoyers@...icios.com> wrote:
>
> On 2025-06-26 11:52, Dylan Yudaken wrote:
> > No reason to not allow MEMBARRIER_CMD_FLAG_CPU on
> > MEMBARRIER_CMD_PRIVATE_EXPEDITED or
> > MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE.
> >
> > If it is known specifically what cpu you want to interrupt then there
> > is a decent efficiency saving in not interrupting all the other ones.
> >
> > Also - the code already works as is for them.
>
> Can you elaborate on a concrete use-case justifying adding this ?
>
> Thanks,
>
> Mathieu
>

So my use case is for core-local data such as performance counters.

I have a library that allows a fast thread to "lock" a core -> do
some work (probably incrementing some performance counters) -> unlock.
The "lock" uses restartable sequences (i.e. no serializing
instructions), and the unlock just writes a 0 to memory (again, no
serializing instructions).

A slow thread will occasionally (say every few minutes) try to read
the data computed in the work section.
It does this by disabling locking and firing off a
membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ) with
MEMBARRIER_CMD_FLAG_CPU on that core, so that any in-flight rseq
critical section there is restarted and the core is definitely either
"locked" or "unlocked".
It then spins waiting for the core to be unlocked.
At this point my understanding is a bit fuzzy - but I believe that
core also needs a memory barrier, since there is no serializing
instruction on the fast path and the processor would happily reorder
some of the "work" stores after the "unlock" store.

That barrier is what I want from this. But since I know the cpu_id
that I am working with, I don't need to do a barrier on _all_ the
cores.
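
For reference, here is a rough sketch of what the slow thread's targeted
call looks like with the command that already accepts a cpu_id today
(MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ). The sketch_* names are made up
for illustration and are not from my library; the constants are local
copies of the values in <linux/membarrier.h> (Linux 5.10+) so that the
snippet also builds against older UAPI headers. Treat it as a minimal
sketch under those assumptions, not as the library's actual code.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Local copies of the UAPI values from <linux/membarrier.h>.
 * Assumption: real code would include that header directly. */
enum {
	SKETCH_CMD_PRIVATE_EXPEDITED_RSEQ          = 1 << 7,
	SKETCH_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ = 1 << 8,
	SKETCH_CMD_FLAG_CPU                        = 1 << 0,
};

/* Thin wrapper: glibc has no membarrier() function, so go through
 * syscall(2). Returns 0 on success or a negative errno value. */
static int sketch_membarrier(int cmd, unsigned int flags, int cpu_id)
{
	long rc = syscall(SYS_membarrier, cmd, flags, cpu_id);

	return rc < 0 ? -errno : 0;
}

/* Must be called once per process before the expedited RSEQ command
 * is used; the kernel returns EPERM otherwise. */
static int sketch_register_rseq_barrier(void)
{
	return sketch_membarrier(SKETCH_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ,
				 0, 0);
}

/* Slow path: interrupt only the given CPU instead of every CPU that
 * runs a thread of this process. Negative ids are rejected here
 * because this sketch only wants the single-CPU form. */
static int sketch_barrier_on_cpu(int cpu_id)
{
	if (cpu_id < 0)
		return -EINVAL;

	return sketch_membarrier(SKETCH_CMD_PRIVATE_EXPEDITED_RSEQ,
				 SKETCH_CMD_FLAG_CPU, cpu_id);
}
```

The slow thread would call sketch_register_rseq_barrier() once at
startup, then sketch_barrier_on_cpu(cpu) before spinning on that core's
lock word. The patch would let the same flags/cpu_id pair be passed
with MEMBARRIER_CMD_PRIVATE_EXPEDITED and
MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE as well.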

To be clear:
(1) I don't have a current real-world use case.
(2) My library/design/understanding might be buggy.
(3) I don't have a use case for the SYNC_CORE part, but again it
seemed easy enough to add and I presume others might have a use case.