[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20241009125127.18902-9-neeraj.upadhyay@kernel.org>
Date: Wed, 9 Oct 2024 18:21:25 +0530
From: neeraj.upadhyay@...nel.org
To: rcu@...r.kernel.org
Cc: linux-kernel@...r.kernel.org,
paulmck@...nel.org,
joel@...lfernandes.org,
frederic@...nel.org,
boqun.feng@...il.com,
urezki@...il.com,
rostedt@...dmis.org,
mathieu.desnoyers@...icios.com,
jiangshanlai@...il.com,
qiang.zhang1211@...il.com,
peterz@...radead.org,
neeraj.upadhyay@....com,
Neeraj Upadhyay <neeraj.upadhyay@...nel.org>
Subject: [PATCH v2 08/10] rcu/tasks: Make RCU-tasks pay attention to idle tasks
From: Neeraj Upadhyay <neeraj.upadhyay@...nel.org>
Currently, idle tasks are ignored by RCU-tasks. Change this to
start paying attention to idle tasks except in deep-idle functions
where RCU is not watching. With this, for architectures where
kernel entry/exit and deep-idle functions have been properly tagged
noinstr, Tasks Rude RCU can be disabled.
[ neeraj.upadhyay: Frederic Weisbecker and Paul E. McKenney feedback. ]
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@...nel.org>
---
.../RCU/Design/Requirements/Requirements.rst | 12 +++---
kernel/rcu/tasks.h | 41 ++++++++-----------
2 files changed, 24 insertions(+), 29 deletions(-)
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index 6125e7068d2c..5016b85d53d7 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -2611,8 +2611,8 @@ critical sections that are delimited by voluntary context switches, that
is, calls to schedule(), cond_resched(), and
synchronize_rcu_tasks(). In addition, transitions to and from
userspace execution also delimit tasks-RCU read-side critical sections.
-Idle tasks are ignored by Tasks RCU, and Tasks Rude RCU may be used to
-interact with them.
+Idle tasks which are idle from RCU's perspective are ignored by Tasks RCU,
+and Tasks Rude RCU may be used to interact with them.
Note well that involuntary context switches are *not* Tasks-RCU quiescent
states. After all, in preemptible kernels, a task executing code in a
@@ -2643,10 +2643,10 @@ moniker. And this operation is considered to be quite rude by real-time
workloads that don't want their ``nohz_full`` CPUs receiving IPIs and
by battery-powered systems that don't want their idle CPUs to be awakened.
-Once kernel entry/exit and deep-idle functions have been properly tagged
-``noinstr``, Tasks RCU can start paying attention to idle tasks (except
-those that are idle from RCU's perspective) and then Tasks Rude RCU can
-be removed from the kernel.
+As Tasks RCU now pays attention to idle tasks (except those that are idle
+from RCU's perspective), once kernel entry/exit and deep-idle functions have
+been properly tagged ``noinstr``, Tasks Rude RCU can be removed from the
+kernel.
The tasks-rude-RCU API is also reader-marking-free and thus quite compact,
consisting solely of synchronize_rcu_tasks_rude().
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 1947f9b6346d..72dc0d0a4a8f 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -912,14 +912,15 @@ static void rcu_tasks_wait_gp(struct rcu_tasks *rtp)
////////////////////////////////////////////////////////////////////////
//
// Simple variant of RCU whose quiescent states are voluntary context
-// switch, cond_resched_tasks_rcu_qs(), user-space execution, and idle.
-// As such, grace periods can take one good long time. There are no
-// read-side primitives similar to rcu_read_lock() and rcu_read_unlock()
-// because this implementation is intended to get the system into a safe
-// state for some of the manipulations involved in tracing and the like.
-// Finally, this implementation does not support high call_rcu_tasks()
-// rates from multiple CPUs. If this is required, per-CPU callback lists
-// will be needed.
+// switch, cond_resched_tasks_rcu_qs(), user-space execution, and idle
+// tasks which are in RCU-idle context. As such, grace periods can take
+// one good long time. There are no read-side primitives similar to
+// rcu_read_lock() and rcu_read_unlock() because this implementation is
+// intended to get the system into a safe state for some of the
+// manipulations involved in tracing and the like. Finally, this
+// implementation does not support high call_rcu_tasks() rates from
+// multiple CPUs. If this is required, per-CPU callback lists will be
+// needed.
//
// The implementation uses rcu_tasks_wait_gp(), which relies on function
// pointers in the rcu_tasks structure. The rcu_spawn_tasks_kthread()
@@ -1079,14 +1080,6 @@ static bool rcu_tasks_is_holdout(struct task_struct *t)
if (!READ_ONCE(t->on_rq))
return false;
- /*
- * Idle tasks (or idle injection) within the idle loop are RCU-tasks
- * quiescent states. But CPU boot code performed by the idle task
- * isn't a quiescent state.
- */
- if (is_idle_task(t))
- return false;
-
cpu = task_cpu(t);
if (t == idle_task(cpu))
@@ -1265,11 +1258,12 @@ static void tasks_rcu_exit_srcu_stall(struct timer_list *unused)
* period elapses, in other words after all currently executing RCU
* read-side critical sections have completed. call_rcu_tasks() assumes
* that the read-side critical sections end at a voluntary context
- * switch (not a preemption!), cond_resched_tasks_rcu_qs(), entry into idle,
- * or transition to usermode execution. As such, there are no read-side
- * primitives analogous to rcu_read_lock() and rcu_read_unlock() because
- * this primitive is intended to determine that all tasks have passed
- * through a safe state, not so much for data-structure synchronization.
+ * switch (not a preemption!), cond_resched_tasks_rcu_qs(), entry into
+ * RCU-idle context or transition to usermode execution. As such, there
+ * are no read-side primitives analogous to rcu_read_lock() and
+ * rcu_read_unlock() because this primitive is intended to determine
+ * that all tasks have passed through a safe state, not so much for
+ * data-structure synchronization.
*
* See the description of call_rcu() for more detailed information on
* memory ordering guarantees.
@@ -1287,8 +1281,9 @@ EXPORT_SYMBOL_GPL(call_rcu_tasks);
* grace period has elapsed, in other words after all currently
* executing rcu-tasks read-side critical sections have elapsed. These
* read-side critical sections are delimited by calls to schedule(),
- * cond_resched_tasks_rcu_qs(), idle execution, userspace execution, calls
- * to synchronize_rcu_tasks(), and (in theory, anyway) cond_resched().
+ * cond_resched_tasks_rcu_qs(), idle execution within RCU-idle context,
+ * userspace execution, calls to synchronize_rcu_tasks(), and (in theory,
+ * anyway) cond_resched().
*
* This is a very specialized primitive, intended only for a few uses in
* tracing and other situations requiring manipulation of function
--
2.40.1
Powered by blists - more mailing lists