Message-ID: <177021163876.2495410.14698078484542619057.tip-bot2@tip-bot2>
Date: Wed, 04 Feb 2026 13:27:18 -0000
From: "tip-bot2 for Thomas Gleixner" <tip-bot2@...utronix.de>
To: linux-tip-commits@...r.kernel.org
Cc: Thomas Gleixner <tglx@...utronix.de>,
 "Peter Zijlstra (Intel)" <peterz@...radead.org>,
 Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, x86@...nel.org,
 linux-kernel@...r.kernel.org
Subject: [tip: sched/urgent] sched/mmcid: Optimize transitional CIDs when
 scheduling out

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     4463c7aa11a6e67169ae48c6804968960c4bffea
Gitweb:        https://git.kernel.org/tip/4463c7aa11a6e67169ae48c6804968960c4bffea
Author:        Thomas Gleixner <tglx@...nel.org>
AuthorDate:    Mon, 02 Feb 2026 10:39:55 +01:00
Committer:     Peter Zijlstra <peterz@...radead.org>
CommitterDate: Wed, 04 Feb 2026 12:21:12 +01:00

sched/mmcid: Optimize transitional CIDs when scheduling out

During the investigation of the various transition mode issues,
instrumentation revealed that the number of bitmap operations can be
significantly reduced when a task with a transitional CID schedules out
after the fixup function has completed and disabled the transition mode.

At that point the mode is stable, so it is not required to drop the
transitional CID back into the pool. As the fixup is complete, the
potential exhaustion of the CID pool is no longer possible, so the CID can
be transferred to the scheduling out task or to the CPU depending on the
current ownership mode.

The racy snapshot of mm_cid::mode, which contains both the ownership state
and the transition bit, is valid because the runqueue lock is held and the
fixup function of a concurrent mode switch is serialized.

Assigning the ownership right there not only spares the bitmap access for
dropping the CID, it also avoids the bitmap access when the task is
scheduled back in, as it directly hits the fast path in both modes when
the CID is within the optimal range. If the CID is outside the range, the
next schedule in will need to converge, so dropping it right away is
sensible. In the good case this also allows the next schedule in operation
to go into the fast path.
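
The decision described above can be modelled in user space. The following
is a minimal sketch under stated assumptions, not the kernel code: the
MODEL_* flag values, the enum and model_schedout() are hypothetical names
made up for illustration. The real implementation additionally updates the
per CPU CID and, in the drop case, the CID bitmap.

```c
/*
 * Hypothetical flag bits for this model only. The kernel's actual bit
 * assignments may differ.
 */
#define MODEL_CID_TRANSIT	(1u << 29)
#define MODEL_CID_ONCPU		(1u << 30)
#define MODEL_CID_UNSET		(1u << 31)

enum schedout_action {
	LEAVE_UNCHANGED,	/* CID is not transitional, nothing to do */
	TRANSFER_TO_TASK,	/* Stable per task mode, the task keeps the CID */
	TRANSFER_TO_CPU,	/* Stable per CPU mode, the CPU owns the CID */
	DROP_TO_POOL,		/* Still in transition or CID out of range */
};

struct schedout_result {
	enum schedout_action action;
	unsigned int cid;	/* Value stored in the task after schedule out */
};

/*
 * Model of the schedule out decision. @mode is a snapshot which carries
 * both the ownership state and the transition bit.
 */
static struct schedout_result model_schedout(unsigned int task_cid,
					     unsigned int mode,
					     unsigned int max_cids)
{
	struct schedout_result res = { LEAVE_UNCHANGED, task_cid };
	unsigned int cid;

	/* Common case: the CID is not transitional */
	if (!(task_cid & MODEL_CID_TRANSIT))
		return res;

	/* Strip the transition bit to get the plain CID number */
	cid = task_cid & ~MODEL_CID_TRANSIT;

	if (!(mode & MODEL_CID_TRANSIT) && cid < max_cids) {
		/* Mode is stable and the CID is in range: keep it */
		if (mode & MODEL_CID_ONCPU) {
			res.action = TRANSFER_TO_CPU;
			res.cid = cid | MODEL_CID_ONCPU;
		} else {
			res.action = TRANSFER_TO_TASK;
			res.cid = cid;
		}
	} else {
		/* Transition still active or CID out of range: drop it */
		res.action = DROP_TO_POOL;
		res.cid = MODEL_CID_UNSET;
	}
	return res;
}
```

In this model, a transitional CID 3 with a stable per task mode and a limit
of 8 yields TRANSFER_TO_TASK with CID 3, while the same CID with the
transition bit still set in the mode yields DROP_TO_POOL.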

With a thread pool benchmark which is configured to cross the mode switch
boundaries frequently, this reduces the number of bitmap operations by
about 30% and increases the fastpath utilization by a low single digit
percentage.

Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Link: https://patch.msgid.link/20260201192835.100194627@kernel.org
---
 kernel/sched/sched.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index f85fd6b..bd350e4 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3902,12 +3902,31 @@ static __always_inline void mm_cid_schedin(struct task_struct *next)
 
 static __always_inline void mm_cid_schedout(struct task_struct *prev)
 {
+	struct mm_struct *mm = prev->mm;
+	unsigned int mode, cid;
+
 	/* During mode transitions CIDs are temporary and need to be dropped */
 	if (likely(!cid_in_transit(prev->mm_cid.cid)))
 		return;
 
-	mm_drop_cid(prev->mm, cid_from_transit_cid(prev->mm_cid.cid));
-	prev->mm_cid.cid = MM_CID_UNSET;
+	mode = READ_ONCE(mm->mm_cid.mode);
+	cid = cid_from_transit_cid(prev->mm_cid.cid);
+
+	/*
+	 * If transition mode is done, transfer ownership when the CID is
+	 * within the convergence range to optimize the next schedule in.
+	 */
+	if (!cid_in_transit(mode) && cid < READ_ONCE(mm->mm_cid.max_cids)) {
+		if (cid_on_cpu(mode))
+			cid = cid_to_cpu_cid(cid);
+
+		/* Update both so that the next schedule in goes into the fast path */
+		mm_cid_update_pcpu_cid(mm, cid);
+		prev->mm_cid.cid = cid;
+	} else {
+		mm_drop_cid(mm, cid);
+		prev->mm_cid.cid = MM_CID_UNSET;
+	}
 }
 
 static inline void mm_cid_switch_to(struct task_struct *prev, struct task_struct *next)
