Message-ID: <5d2ae241-a6ab-4cc4-b369-18860bdceb3c@efficios.com>
Date: Thu, 4 Sep 2025 13:19:54 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Thomas Gleixner <tglx@...utronix.de>, LKML <linux-kernel@...r.kernel.org>
Cc: Jens Axboe <axboe@...nel.dk>, Peter Zijlstra <peterz@...radead.org>,
"Paul E. McKenney" <paulmck@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
Paolo Bonzini <pbonzini@...hat.com>, Sean Christopherson
<seanjc@...gle.com>, Wei Liu <wei.liu@...nel.org>,
Dexuan Cui <decui@...rosoft.com>, x86@...nel.org,
Arnd Bergmann <arnd@...db.de>, Heiko Carstens <hca@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Huacai Chen <chenhuacai@...nel.org>,
Paul Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt
<palmer@...belt.com>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
Madhavan Srinivasan <maddy@...ux.ibm.com>,
Michael Ellerman <mpe@...erman.id.au>, Nicholas Piggin <npiggin@...il.com>,
Christophe Leroy <christophe.leroy@...roup.eu>
Subject: Re: [patch V2 06/37] rseq: Simplify the event notification

On 2025-09-02 09:39, Thomas Gleixner wrote:
> On Mon, Aug 25 2025 at 13:36, Mathieu Desnoyers wrote:
>> On 2025-08-23 12:39, Thomas Gleixner wrote:
>>> Since commit 0190e4198e47 ("rseq: Deprecate RSEQ_CS_FLAG_NO_RESTART_ON_*
>>> flags") the bits in task::rseq_event_mask are meaningless and just extra
>>> work in terms of setting them individually.
>>>
>>> Aside of that the only relevant point where an event has to be raised is
>>> context switch. Neither the CPU nor MM CID can change without going through
>>> a context switch.
>>
>> Note: we may want to include the numa node id field as well in this
>> list of fields.
>
> What for? The node to CPU relationship is not magically changing, so you
> can't have a situation where the task stays on the same CPU and suddenly
> runs on a different node.

Good point. For the record, I suspect what I was confused about on the
powerpc side is that PAPR [1] allows the architecture to reconfigure
NUMA node to CPU mapping for virtualization use-cases, but AFAIU when
this happens the kernel will keep its own original mapping after the
CPUs were onlined at least once.

This would explain the purpose of the lparnumascore command that returns
a score estimating how much the kernel view of NUMA node to CPU mappings
differs from the current HW.

So AFAIU, from a kernel perspective, the NUMA node to CPU mapping is
invariant after it's been observed.
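
The invariant described above can be spot-checked from user space. The following is a minimal sketch, not kernel code: it samples the getcpu system call repeatedly and counts how often a CPU number is reported with a different node than the first time it was seen. The function name count_mapping_changes and the MAX_CPUS bound are hypothetical, chosen only for this illustration. If the mapping is invariant as assumed, the count should be zero.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed upper bound on CPU numbers, for this illustration only. */
#define MAX_CPUS 4096

/*
 * Sample the kernel's (cpu, node) pair `samples` times and count how
 * often a CPU is seen with a node different from its first observation.
 * Returns the number of mismatches, or -1 on error.
 */
int count_mapping_changes(int samples)
{
	static int node_of_cpu[MAX_CPUS];
	int changes = 0;

	for (int i = 0; i < MAX_CPUS; i++)
		node_of_cpu[i] = -1;	/* -1: CPU not observed yet */

	for (int s = 0; s < samples; s++) {
		unsigned int cpu, node;

		/* The third getcpu argument is unused on current kernels. */
		if (syscall(SYS_getcpu, &cpu, &node, NULL) != 0)
			return -1;
		if (cpu >= MAX_CPUS)
			return -1;

		if (node_of_cpu[cpu] == -1)
			node_of_cpu[cpu] = (int)node;	/* first observation */
		else if (node_of_cpu[cpu] != (int)node)
			changes++;			/* mapping changed */
	}
	return changes;
}
```

The task may migrate between CPUs while sampling, which is fine: the check is per CPU number, so only a CPU whose node changes between two observations is counted.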

Out of curiosity, does the hwloc tool return the kernel's
CPU to node mapping as "logical" listing, and the PAPR CPU to node
mapping (which can change dynamically) as "physical" listing?

[1] https://github.com/ibm-power-utilities/powerpc-utils

>
>>> - unsigned long rseq_event_mask;
>>> + bool rseq_event_pending;
>>
>> AFAIU, this rseq_event_pending field is now concurrently set from:
>>
>> - rseq_signal_deliver (without any preempt nor irqoff guard)
>> - rseq_sched_switch_event (with preemption disabled)
>>
>> Is it safe to concurrently store to a "bool" field within a structure
>> without any protection against concurrent stores ? Typically I've used
>> an integer field just to be on the safe side in that kind of situation.
>>
>> AFAIR, a bool type needs to be at least 1 byte. Do all architectures
>> supported by Linux have a single byte store instruction, or can we end
>> up incorrectly storing to other nearby fields ? (for instance, DEC
>> Alpha ?)
>
> All architectures which support RSEQ do and I really don't care about
> ALPHA, which has other problems than that.
OK.
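
To make the byte-store concern in the quoted question concrete, here is a minimal user-space sketch. It is a single-threaded simulation, not kernel code, and struct demo_task and every function name in it are hypothetical. It contrasts two writers that each store only their own byte with two writers that emulate a CPU lacking byte stores (as on early Alpha, before the BWX extension) by doing a word-sized load, modify, store. The interleaving is fixed by hand so the result is deterministic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a bool flag sharing a 32-bit word with a neighbor. */
struct demo_task {
	uint8_t neighbor;	/* stands in for any adjacent field */
	bool    pending;	/* stands in for a flag like rseq_event_pending */
	uint8_t pad[2];
};

_Static_assert(sizeof(struct demo_task) == sizeof(uint32_t),
	       "demo assumes the struct fits exactly one 32-bit word");

/* Read the whole word containing both fields. */
uint32_t word_load(const struct demo_task *t)
{
	uint32_t w;

	memcpy(&w, t, sizeof(w));
	return w;
}

/* Write the whole word back, overwriting both fields. */
void word_store(struct demo_task *t, uint32_t w)
{
	memcpy(t, &w, sizeof(w));
}

/* Return a copy of `w` with the byte at `offset` replaced by `val`. */
uint32_t patch_byte(uint32_t w, size_t offset, uint8_t val)
{
	uint8_t bytes[sizeof(w)];

	memcpy(bytes, &w, sizeof(w));
	bytes[offset] = val;
	memcpy(&w, bytes, sizeof(w));
	return w;
}

/* Byte stores: each writer touches only its own field. */
struct demo_task race_with_byte_stores(void)
{
	struct demo_task t = {0};

	t.pending = true;	/* writer A */
	t.neighbor = 42;	/* writer B */
	return t;
}

/* Word read-modify-write: B loads before A stores, so A's update is lost. */
struct demo_task race_with_word_rmw(void)
{
	struct demo_task t = {0};
	uint32_t a = word_load(&t);	/* writer A loads the word */
	uint32_t b = word_load(&t);	/* writer B loads the same stale word */

	word_store(&t, patch_byte(a, offsetof(struct demo_task, pending), 1));
	word_store(&t, patch_byte(b, offsetof(struct demo_task, neighbor), 42));
	return t;
}
```

With byte stores both updates survive. In the emulated word-sized case the second store writes back a stale copy of the flag, so pending ends up false even though writer A set it. That lost update is the failure mode the question asks about.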
Thanks!
Mathieu
>
> Thanks,
>
> tglx
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com