Message-Id: <20231107230822.371443-11-ankur.a.arora@oracle.com>
Date: Tue, 7 Nov 2023 15:08:03 -0800
From: Ankur Arora <ankur.a.arora@...cle.com>
To: linux-kernel@...r.kernel.org
Cc: tglx@...utronix.de, peterz@...radead.org,
torvalds@...ux-foundation.org, paulmck@...nel.org,
linux-mm@...ck.org, x86@...nel.org, akpm@...ux-foundation.org,
luto@...nel.org, bp@...en8.de, dave.hansen@...ux.intel.com,
hpa@...or.com, mingo@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, willy@...radead.org, mgorman@...e.de,
jon.grimm@....com, bharata@....com, raghavendra.kt@....com,
boris.ostrovsky@...cle.com, konrad.wilk@...cle.com,
jgross@...e.com, andrew.cooper3@...rix.com, mingo@...nel.org,
bristot@...nel.org, mathieu.desnoyers@...icios.com,
geert@...ux-m68k.org, glaubitz@...sik.fu-berlin.de,
anton.ivanov@...bridgegreys.com, mattst88@...il.com,
krypton@...ich-teichert.org, rostedt@...dmis.org,
David.Laight@...LAB.COM, richard@....at, mjguzik@...il.com,
Ankur Arora <ankur.a.arora@...cle.com>,
Tejun Heo <tj@...nel.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
Nicholas Piggin <npiggin@...il.com>
Subject: [RFC PATCH 67/86] treewide: kernel: remove cond_resched()

There are broadly three sets of uses of cond_resched():

1.  Calls to cond_resched() out of the goodness of our heart,
    otherwise known as avoiding lockup splats.

2.  Open coded variants of cond_resched_lock() which call
    cond_resched() (see the sketch following this list).

3.  Retry or error handling loops, where cond_resched() is used as a
    quick alternative to spinning in a tight loop.

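As an illustration of set-2, an open-coded preemption point under a
spinlock typically has the shape sketched below (schematic only, not
taken from any of the files touched here; "lock" is a stand-in
spinlock). This is the dance that cond_resched_lock(&lock) wraps up:

	if (need_resched() || spin_needbreak(&lock)) {
		spin_unlock(&lock);
		cond_resched();
		spin_lock(&lock);
	}
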
When running under a full preemption model, cond_resched() reduces to
a NOP (not even a barrier), so removing it obviously cannot matter.

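For reference, with CONFIG_PREEMPTION (and PREEMPT_DYNAMIC disabled)
the definition reduces to roughly the following (simplified from
include/linux/sched.h; the debug annotations are elided):

	static inline int _cond_resched(void)
	{
		return 0;
	}

	#define cond_resched()	_cond_resched()
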
But consider only voluntary preemption models (for, say, code that
has mostly been tested under those). For set-1 and set-2, the
scheduler can now preempt kernel tasks running beyond their time
quanta anywhere they are preemptible() [1], which removes any need
for these explicitly placed scheduling points.

The cond_resched() calls in set-3 are a little more difficult.

To start with, given its NOP character under full preemption, it
never actually saved us from a tight loop.

With voluntary preemption, it's not a NOP, but it might as well be --
for most workloads the scheduler does not have an interminable supply
of runnable tasks on the runqueue, so cond_resched() typically finds
nothing to switch to and the loop spins on regardless.

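A typical set-3 loop has roughly this shape (hypothetical;
try_claim_resource() stands in for whatever operation is being
retried):

	/* Spin until the resource frees up. */
	while (!try_claim_resource())
		cond_resched();
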
So, cond_resched() is useful for avoiding softlockup splats, but not
terribly good for error handling. Ideally, these should be replaced
with some kind of timed or event wait.

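For instance, the hypothetical loop above could instead sleep until
woken, or until a bounded timeout expires (resource_wq,
resource_available() and the 10ms period are all made up for
illustration):

	while (!try_claim_resource())
		wait_event_timeout(resource_wq, resource_available(),
				   msecs_to_jiffies(10));
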
For now we use cond_resched_stall(), which tries to schedule if
possible, and executes a cpu_relax() if not.
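
Going by that description (the helper itself is added earlier in this
series), its behaviour amounts to something like the sketch below.
This paraphrases the stated semantics and is not the series' actual
implementation:

	static inline int cond_resched_stall(void)
	{
		if (need_resched()) {
			schedule();		/* preemption point */
			return 1;
		}
		cpu_relax();
		return 0;
	}
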
All of these are set-1 or set-2. The call in stop_one_cpu() is
replaced with cond_resched_stall() rather than removed, to still
allow it a chance to schedule.

[1] https://lore.kernel.org/lkml/20231107215742.363031-1-ankur.a.arora@oracle.com/

Cc: Tejun Heo <tj@...nel.org>
Cc: Lai Jiangshan <jiangshanlai@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: Nicholas Piggin <npiggin@...il.com>
Signed-off-by: Ankur Arora <ankur.a.arora@...cle.com>
---
 kernel/kthread.c      |  1 -
 kernel/softirq.c      |  1 -
 kernel/stop_machine.c |  2 +-
 kernel/workqueue.c    | 10 ----------
 4 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 1eea53050bab..e111eebee240 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -830,7 +830,6 @@ int kthread_worker_fn(void *worker_ptr)
 		schedule();
 
 	try_to_freeze();
-	cond_resched();
 	goto repeat;
 }
 EXPORT_SYMBOL_GPL(kthread_worker_fn);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 210cf5f8d92c..c80237cbcb3d 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -920,7 +920,6 @@ static void run_ksoftirqd(unsigned int cpu)
 		 */
 		__do_softirq();
 		ksoftirqd_run_end();
-		cond_resched();
 		return;
 	}
 	ksoftirqd_run_end();
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index cedb17ba158a..1929fe8ecd70 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -148,7 +148,7 @@ int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg)
 	 * In case @cpu == smp_proccessor_id() we can avoid a sleep+wakeup
 	 * cycle by doing a preemption:
 	 */
-	cond_resched();
+	cond_resched_stall();
 	wait_for_completion(&done.completion);
 	return done.ret;
 }
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index a3522b70218d..be5080e1b7d6 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2646,16 +2646,6 @@ __acquires(&pool->lock)
 		dump_stack();
 	}
 
-	/*
-	 * The following prevents a kworker from hogging CPU on !PREEMPTION
-	 * kernels, where a requeueing work item waiting for something to
-	 * happen could deadlock with stop_machine as such work item could
-	 * indefinitely requeue itself while all other CPUs are trapped in
-	 * stop_machine. At the same time, report a quiescent RCU state so
-	 * the same condition doesn't freeze RCU.
-	 */
-	cond_resched();
-
 	raw_spin_lock_irq(&pool->lock);
 
 	/*
--
2.31.1