linux-kernel - Re: [patch V3 00/12] rseq: Implement time slice extension mechanism

Message-ID: <3a0f1467-7fff-48c6-b0d1-772917cc6143@efficios.com>
Date: Wed, 12 Nov 2025 15:46:25 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Thomas Gleixner <tglx@...utronix.de>,
 Prakash Sangappa <prakash.sangappa@...cle.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Peter Zijlstra
 <peterz@...radead.org>, "Paul E. McKenney" <paulmck@...nel.org>,
 Boqun Feng <boqun.feng@...il.com>, Jonathan Corbet <corbet@....net>,
 Madadi Vineeth Reddy <vineethr@...ux.ibm.com>,
 K Prateek Nayak <kprateek.nayak@....com>,
 Steven Rostedt <rostedt@...dmis.org>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 Arnd Bergmann <arnd@...db.de>,
 "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: Re: [patch V3 00/12] rseq: Implement time slice extension mechanism

On 2025-11-12 15:31, Thomas Gleixner wrote:
> On Tue, Nov 11 2025 at 11:42, Mathieu Desnoyers wrote:
>> On 2025-11-10 09:23, Mathieu Desnoyers wrote:
>> I've spent some time digging through Thomas' implementation of
>> mm_cid management. I've spotted something which may explain
>> the watchdog panic. Here is the scenario:
>>
>> 1) A process is constrained to a subset of the possible CPUs,
>>      and has enough threads to swap from per-thread to per-cpu mm_cid
>>      mode. It runs happily in that per-cpu mode.
>>
>> 2) The number of allowed CPUs is increased for a process, thus invoking
>>      mm_update_cpus_allowed. This switches the mode back to per-thread,
>>      but delays invocation of mm_cid_work_fn to some point in the future,
>>      in thread context, through irq_work + schedule_work.
>>
>>      At that point, because only __mm_update_max_cids was called by
>>      mm_update_cpus_allowed, the max_cids is updated, but mc->transit
>>      is still zero.
>>
>>      Also, until mm_cid_fixup_cpus_to_tasks is invoked by either the
>>      scheduled work or near the end of sched_mm_cid_fork, or by
>>      sched_mm_cid_exit, we are in a state where mm_cids are still
>>      owned by CPUs, but we are now in per-thread mm_cid mode, which
>>      means that the mc->max_cids value depends on the number of threads.
> 
> No. It stays in per CPU mode. The mode switch itself happens either in
> the worker or on fork/exit whatever comes first.

Ah, that's what I missed. All good then.
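
For readers following the thread, the corrected understanding can be
illustrated with a toy userspace model. This is only a sketch under
stated assumptions: every name below (toy_mm, toy_update_cpus_allowed,
toy_complete_switch, toy_demo) is hypothetical and none of it is the
kernel's actual code or data layout. It only shows the ordering Thomas
describes: widening the allowed CPUs records a pending switch, and the
mode itself changes later, in the worker or on fork/exit, whichever
comes first.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's mm_cid code. */
enum toy_cid_mode { TOY_PER_THREAD, TOY_PER_CPU };

struct toy_mm {
	enum toy_cid_mode mode;     /* mode currently in effect */
	enum toy_cid_mode pending;  /* mode requested by the last update */
	bool switch_pending;        /* deferred switch not yet performed */
	unsigned int cpus_allowed;
};

/* Models the affinity update: it only *requests* a mode switch. */
static void toy_update_cpus_allowed(struct toy_mm *mm, unsigned int ncpus,
				    enum toy_cid_mode wanted)
{
	mm->cpus_allowed = ncpus;
	if (wanted != mm->mode) {
		mm->pending = wanted;
		mm->switch_pending = true;  /* mm->mode is left untouched */
	}
}

/* Models the worker, fork, or exit path: whichever runs first switches. */
static bool toy_complete_switch(struct toy_mm *mm)
{
	if (!mm->switch_pending)
		return false;               /* someone else already did it */
	mm->mode = mm->pending;
	mm->switch_pending = false;
	return true;
}

/* Walks through the scenario discussed above. */
static void toy_demo(void)
{
	struct toy_mm mm = { TOY_PER_CPU, TOY_PER_CPU, false, 4 };

	toy_update_cpus_allowed(&mm, 64, TOY_PER_THREAD);
	assert(mm.mode == TOY_PER_CPU);      /* still per-CPU after update */

	assert(toy_complete_switch(&mm));    /* e.g. the worker runs first */
	assert(mm.mode == TOY_PER_THREAD);

	assert(!toy_complete_switch(&mm));   /* later fork/exit: no-op */
}
```

In this model there is no window in which the mode is already per-thread
while the CIDs are still owned by CPUs, which is the state I had wrongly
assumed in step 2.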

[...]

> 
> There was an issue in V3 with the uninitialized transit member and an
> off-by-one error in one of the transition functions. It's fixed in the
> git tree, but I haven't posted it yet because I was AFK for a week.
> 
> I did not notice the V3 issue because tests passed on a small machine,
> but after I did a rebase to the tip rseq and uaccess bits, I noticed the
> failure because I tested on a larger box.

Good! We'll see if this fixes the issue observed by Prakash. If not,
I'm curious to validate that num_possible_cpus() is always set to its
final value before _any_ mm is created.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com