Message-ID: <c2e4fed9-b207-4d28-93f5-b09f0fe78e35@efficios.com>
Date: Thu, 30 Oct 2025 11:51:03 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Thomas Gleixner <tglx@...utronix.de>, LKML <linux-kernel@...r.kernel.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
Gabriele Monaco <gmonaco@...hat.com>, Michael Jeanson
<mjeanson@...icios.com>, Jens Axboe <axboe@...nel.dk>,
"Paul E. McKenney" <paulmck@...nel.org>,
"Gautham R. Shenoy" <gautham.shenoy@....com>,
Florian Weimer <fweimer@...hat.com>, Tim Chen <tim.c.chen@...el.com>,
Yury Norov <yury.norov@...il.com>, Shrikanth Hegde <sshegde@...ux.ibm.com>
Subject: Re: [patch V3 17/20] sched/mmcid: Provide CID ownership mode fixup
functions
On 2025-10-29 09:09, Thomas Gleixner wrote:
>
> At the point of switching to per CPU mode the new user is not yet visible
> in the system, so the task which initiated the fork() runs the fixup
> function: mm_cid_fixup_tasks_to_cpu() walks the thread list and either
> transfers each tasks owned CID to the CPU the task runs on or drops it into
> the CID pool if a task is not on a CPU at that point in time. Tasks which
> schedule in before the task walk reaches them do the handover in
> mm_cid_schedin(). When mm_cid_fixup_tasks_to_cpus() completes it's
> guaranteed that no task related to that MM owns a CID anymore.
>
> Switching back to task mode happens when the user count goes below the
> threshold which was recorded on the per CPU mode switch:
>
> pcpu_thrs = min(opt_cids - (opt_cids / 4), nr_cpu_ids / 2);
>
AFAIU this provides a hysteresis, so we don't switch back and
forth between modes if a single thread is forked/exits repeatedly,
right?
> did not cover yet do the handover themself.
themselves
>
> This transition from CPU to per task ownership happens in two phases:
>
> 1) mm:mm_cid.transit contains MM_CID_TRANSIT. This is OR'ed on the task
> CID and denotes that the CID is only temporarily owned by the
> task. When it schedules out the task drops the CID back into the
> pool if this bit is set.
OK, so the mm_drop_cid() on sched out only happens due to a transition
from per-cpu back to per-task. This answers my question in the previous
patch.
>
> 2) The initiating context walks the per CPU space and after completion
> clears mm:mm_cid.transit. After that point the CIDs are strictly
> task owned again.
>
> This two phase transition is required to prevent CID space exhaustion
> during the transition as a direct transfer of ownership would fail if
> two tasks are scheduled in on the same CPU before the fixup freed per
> CPU CIDs.
Clever. :-)
> + * Switching to per CPU mode happens when the user count becomes greater
> + * than the maximum number of CIDs, which is calculated by:
> + *
> + * opt_cids = min(mm_cid::nr_cpus_allowed, mm_cid::users);
> + * max_cids = min(1.25 * opt_cids, num_possible_cpus());
[...]
> + * Switching back to task mode happens when the user count goes below the
> + * threshold which was recorded on the per CPU mode switch:
> + *
> + * pcpu_thrs = min(opt_cids - (opt_cids / 4), num_possible_cpus() / 2);
I notice that mm_update_cpus_allowed() calls __mm_update_max_cids()
before updating the pcpu_thrs threshold, whereas
sched_mm_cid_{add,remove}_user() only invoke mm_update_max_cids(mm)
without updating pcpu_thrs first.
Is that asymmetry intentional?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com