[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240324170619.545975-7-sashal@kernel.org>
Date: Sun, 24 Mar 2024 13:06:10 -0400
From: Sasha Levin <sashal@...nel.org>
To: linux-kernel@...r.kernel.org,
	stable@...r.kernel.org
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Sebastian Siewior <bigeasy@...utronix.de>,
	Anna-Maria Behnsen <anna-maria@...utronix.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	Boqun Feng <boqun.feng@...il.com>,
	Sasha Levin <sashal@...nel.org>,
	frederic@...nel.org,
	quic_neeraju@...cinc.com,
	joel@...lfernandes.org,
	josh@...htriplett.org,
	rcu@...r.kernel.org
Subject: [PATCH AUTOSEL 6.7 07/11] rcu-tasks: Maintain real-time response in rcu_tasks_postscan()
From: "Paul E. McKenney" <paulmck@...nel.org>
[ Upstream commit 0bb11a372fc8d7006b4d0f42a2882939747bdbff ]
The current code will scan the entirety of each per-CPU list of exiting
tasks in ->rtp_exit_list with interrupts disabled.  This is normally just
fine, because each CPU typically won't have very many tasks in this state.
However, if a large number of tasks block late in do_exit(), these lists
could be arbitrarily long.  Low probability, perhaps, but it really
could happen.
This commit therefore occasionally re-enables interrupts while traversing
these lists, inserting a dummy element to hold the current place in the
list.  In kernels built with CONFIG_PREEMPT_RT=y, this re-enabling happens
after each list element is processed, otherwise every one-to-two jiffies.
[ paulmck: Apply Frederic Weisbecker feedback. ]
Link: https://lore.kernel.org/all/ZdeI_-RfdLR8jlsm@localhost.localdomain/
Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Sebastian Siewior <bigeasy@...utronix.de>
Cc: Anna-Maria Behnsen <anna-maria@...utronix.de>
Cc: Steven Rostedt <rostedt@...dmis.org>
Signed-off-by: Boqun Feng <boqun.feng@...il.com>
Signed-off-by: Sasha Levin <sashal@...nel.org>
---
 kernel/rcu/tasks.h | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 774408899e715..4af68107544d3 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -971,13 +971,33 @@ static void rcu_tasks_postscan(struct list_head *hop)
 	 */
 
 	for_each_possible_cpu(cpu) {
+		unsigned long j = jiffies + 1;
 		struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rcu_tasks.rtpcpu, cpu);
 		struct task_struct *t;
+		struct task_struct *t1;
+		struct list_head tmp;
 
 		raw_spin_lock_irq_rcu_node(rtpcp);
-		list_for_each_entry(t, &rtpcp->rtp_exit_list, rcu_tasks_exit_list)
+		list_for_each_entry_safe(t, t1, &rtpcp->rtp_exit_list, rcu_tasks_exit_list) {
 			if (list_empty(&t->rcu_tasks_holdout_list))
 				rcu_tasks_pertask(t, hop);
+
+			// RT kernels need frequent pauses, otherwise
+			// pause at least once per pair of jiffies.
+			if (!IS_ENABLED(CONFIG_PREEMPT_RT) && time_before(jiffies, j))
+				continue;
+
+			// Keep our place in the list while pausing.
+			// Nothing else traverses this list, so adding a
+			// bare list_head is OK.
+			list_add(&tmp, &t->rcu_tasks_exit_list);
+			raw_spin_unlock_irq_rcu_node(rtpcp);
+			cond_resched(); // For CONFIG_PREEMPT=n kernels
+			raw_spin_lock_irq_rcu_node(rtpcp);
+			t1 = list_entry(tmp.next, struct task_struct, rcu_tasks_exit_list);
+			list_del(&tmp);
+			j = jiffies + 1;
+		}
 		raw_spin_unlock_irq_rcu_node(rtpcp);
 	}
 
-- 
2.43.0
Powered by blists - more mailing lists
 
