[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20220419000014.3948512-5-paulmck@kernel.org>
Date: Mon, 18 Apr 2022 17:00:10 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: rcu@...r.kernel.org
Cc: linux-kernel@...r.kernel.org, kernel-team@...com,
rostedt@...dmis.org,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Martin KaFai Lau <kafai@...com>,
Andrii Nakryiko <andrii@...nel.org>,
"Paul E . McKenney" <paulmck@...nel.org>
Subject: [PATCH rcu 5/9] rcu-tasks: Use schedule_hrtimeout_range() to wait for grace periods
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
The synchronous RCU-tasks grace-period-wait primitives invoke
schedule_timeout_idle() to give readers a chance to exit their
read-side critical sections. Unfortunately, this fails during early
boot on PREEMPT_RT because PREEMPT_RT relies solely on ksoftirqd to run
timer handlers. Because ksoftirqd cannot operate until its kthreads
are spawned, there is a brief period of time following scheduler
initialization where PREEMPT_RT cannot run the timer handlers that
schedule_timeout_idle() relies on, resulting in a hang.
To avoid this boot-time hang, this commit replaces schedule_timeout_idle()
with schedule_hrtimeout(), so that the timer expires in hardirq context.
This is ensures that the timer fires even on PREEMPT_RT throughout the
irqs-enabled portions of boot as well as during runtime.
The timer is set to expire between fract and fract + HZ / 2 jiffies in
order to align with any other timers that might expire during that time,
thus reducing the number of wakeups.
Note that RCU-tasks grace periods are infrequent, so the use of hrtimer
should be fine. In contrast, in common-case code, user of hrtimer
could result in performance issues.
Cc: Martin KaFai Lau <kafai@...com>
Cc: Andrii Nakryiko <andrii@...nel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
---
kernel/rcu/tasks.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 4b91cb214ca7..71fe340ab82a 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -647,13 +647,16 @@ static void rcu_tasks_wait_gp(struct rcu_tasks *rtp)
fract = rtp->init_fract;
while (!list_empty(&holdouts)) {
+ ktime_t exp;
bool firstreport;
bool needreport;
int rtst;
// Slowly back off waiting for holdouts
set_tasks_gp_state(rtp, RTGS_WAIT_SCAN_HOLDOUTS);
- schedule_timeout_idle(fract);
+ exp = jiffies_to_nsecs(fract);
+ __set_current_state(TASK_IDLE);
+ schedule_hrtimeout_range(&exp, jiffies_to_nsecs(HZ / 2), HRTIMER_MODE_REL_HARD);
if (fract < HZ)
fract++;
--
2.31.1.189.g2e36527f23
Powered by blists - more mailing lists