lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20231118155105.25678-5-yury.norov@gmail.com>
Date:   Sat, 18 Nov 2023 07:50:35 -0800
From:   Yury Norov <yury.norov@...il.com>
To:     linux-kernel@...r.kernel.org, Yury Norov <yury.norov@...il.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>
Cc:     Jan Kara <jack@...e.cz>,
        Mirsad Todorovac <mirsad.todorovac@....unizg.hr>,
        Matthew Wilcox <willy@...radead.org>,
        Maxim Kuvyrkov <maxim.kuvyrkov@...aro.org>,
        Alexey Klimov <klimov.linux@...il.com>
Subject: [PATCH 04/34] sched: add cpumask_find_and_set() and use it in __mm_cid_get()

__mm_cid_get() uses a __mm_cid_try_get() helper to atomically acquire a
bit in mm cid mask. Now that we have atomic find_and_set_bit(), we can
easily extend it to cpumasks and use in the scheduler code.

__mm_cid_try_get() has an infinite loop, which may delay forward
progress of __mm_cid_get() when the mask is dense. The
cpumask_find_and_set() doesn't poll the mask infinitely, and returns as
soon as nothing has found after the first iteration, allowing to acquire
the lock, and set use_cid_lock faster, if needed.

cpumask_find_and_set() considers cid mask as a volatile region of memory,
as it actually is in this case. So, if it's changed while search is in
progress, KCSAN wouldn't fire warning on it.

Signed-off-by: Yury Norov <yury.norov@...il.com>
---
 include/linux/cpumask.h | 12 ++++++++++
 kernel/sched/sched.h    | 52 ++++++++++++-----------------------------
 2 files changed, 27 insertions(+), 37 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index cfb545841a2c..c2acced8be4e 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -271,6 +271,18 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
 		small_cpumask_bits, n + 1);
 }
 
+/**
+ * cpumask_find_and_set - find the first unset cpu in a cpumask and
+ *			  set it atomically
+ * @srcp: the cpumask pointer
+ *
+ * Return: >= nr_cpu_ids if nothing is found.
+ */
+static inline unsigned int cpumask_find_and_set(volatile struct cpumask *srcp)
+{
+	return find_and_set_bit(cpumask_bits(srcp), small_cpumask_bits);
+}
+
 /**
  * for_each_cpu - iterate over every cpu in a mask
  * @cpu: the (optionally unsigned) integer iterator
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 2e5a95486a42..b2f095a9fc40 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3345,28 +3345,6 @@ static inline void mm_cid_put(struct mm_struct *mm)
 	__mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
 }
 
-static inline int __mm_cid_try_get(struct mm_struct *mm)
-{
-	struct cpumask *cpumask;
-	int cid;
-
-	cpumask = mm_cidmask(mm);
-	/*
-	 * Retry finding first zero bit if the mask is temporarily
-	 * filled. This only happens during concurrent remote-clear
-	 * which owns a cid without holding a rq lock.
-	 */
-	for (;;) {
-		cid = cpumask_first_zero(cpumask);
-		if (cid < nr_cpu_ids)
-			break;
-		cpu_relax();
-	}
-	if (cpumask_test_and_set_cpu(cid, cpumask))
-		return -1;
-	return cid;
-}
-
 /*
  * Save a snapshot of the current runqueue time of this cpu
  * with the per-cpu cid value, allowing to estimate how recently it was used.
@@ -3381,25 +3359,25 @@ static inline void mm_cid_snapshot_time(struct rq *rq, struct mm_struct *mm)
 
 static inline int __mm_cid_get(struct rq *rq, struct mm_struct *mm)
 {
+	struct cpumask *cpumask = mm_cidmask(mm);
 	int cid;
 
-	/*
-	 * All allocations (even those using the cid_lock) are lock-free. If
-	 * use_cid_lock is set, hold the cid_lock to perform cid allocation to
-	 * guarantee forward progress.
-	 */
+	/* All allocations (even those using the cid_lock) are lock-free. */
 	if (!READ_ONCE(use_cid_lock)) {
-		cid = __mm_cid_try_get(mm);
-		if (cid >= 0)
+		cid = cpumask_find_and_set(cpumask);
+		if (cid < nr_cpu_ids)
 			goto end;
-		raw_spin_lock(&cid_lock);
-	} else {
-		raw_spin_lock(&cid_lock);
-		cid = __mm_cid_try_get(mm);
-		if (cid >= 0)
-			goto unlock;
 	}
 
+	/*
+	 * If use_cid_lock is set, hold the cid_lock to perform cid
+	 * allocation to guarantee forward progress.
+	 */
+	raw_spin_lock(&cid_lock);
+	cid = cpumask_find_and_set(cpumask);
+	if (cid < nr_cpu_ids)
+		goto unlock;
+
 	/*
 	 * cid concurrently allocated. Retry while forcing following
 	 * allocations to use the cid_lock to ensure forward progress.
@@ -3415,9 +3393,9 @@ static inline int __mm_cid_get(struct rq *rq, struct mm_struct *mm)
 	 * all newcoming allocations observe the use_cid_lock flag set.
 	 */
 	do {
-		cid = __mm_cid_try_get(mm);
+		cid = cpumask_find_and_set(cpumask);
 		cpu_relax();
-	} while (cid < 0);
+	} while (cid >= nr_cpu_ids);
 	/*
 	 * Allocate before clearing use_cid_lock. Only care about
 	 * program order because this is for forward progress.
-- 
2.39.2

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ