[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201117093829.GD3121429@hirez.programming.kicks-ass.net>
Date: Tue, 17 Nov 2020 10:38:29 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Mel Gorman <mgorman@...hsingularity.net>
Cc: Will Deacon <will@...nel.org>, Davidlohr Bueso <dave@...olabs.net>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
Tejun Heo <tj@...nel.org>
Subject: [PATCH] sched: Fix rq->nr_iowait ordering
And poking at this reminded me of an order email from TJ that seems to
have stagnated.
---
Subject: sched: Fix rq->nr_iowait ordering
From: Peter Zijlstra <peterz@...radead.org>
Date: Thu, 24 Sep 2020 13:50:42 +0200
schedule() ttwu()
deactivate_task(); if (p->on_rq && ...) // false
atomic_dec(&task_rq(p)->nr_iowait);
if (prev->in_iowait)
atomic_inc(&rq->nr_iowait);
Allows nr_iowait to be decremented before it gets incremented,
resulting in more dodgy IO-wait numbers than usual.
Note that because we can now do ttwu_queue_wakelist() before
p->on_cpu==0, we lose the natural ordering and have to further delay
the decrement.
Fixes: Fixes: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
Reported-by: Tejun Heo <tj@...nel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
---
kernel/sched/core.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2949,7 +2949,12 @@ ttwu_do_activate(struct rq *rq, struct t
#ifdef CONFIG_SMP
if (wake_flags & WF_MIGRATED)
en_flags |= ENQUEUE_MIGRATED;
+ else
#endif
+ if (p->in_iowait) {
+ delayacct_blkio_end(p);
+ atomic_dec(&task_rq(p)->nr_iowait);
+ }
activate_task(rq, p, en_flags);
ttwu_do_wakeup(rq, p, wake_flags, rf);
@@ -3336,11 +3341,6 @@ try_to_wake_up(struct task_struct *p, un
if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
goto unlock;
- if (p->in_iowait) {
- delayacct_blkio_end(p);
- atomic_dec(&task_rq(p)->nr_iowait);
- }
-
#ifdef CONFIG_SMP
/*
* Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
@@ -3411,6 +3411,11 @@ try_to_wake_up(struct task_struct *p, un
cpu = select_task_rq(p, p->wake_cpu, wake_flags | WF_TTWU);
if (task_cpu(p) != cpu) {
+ if (p->in_iowait) {
+ delayacct_blkio_end(p);
+ atomic_dec(&task_rq(p)->nr_iowait);
+ }
+
wake_flags |= WF_MIGRATED;
psi_ttwu_dequeue(p);
set_task_cpu(p, cpu);
Powered by blists - more mailing lists