Message-ID: <CANDhNCpqjiUi4iKKKXH8PLtUKe2JTJdt8W=7CQ6VJi409+RaEw@mail.gmail.com>
Date: Thu, 4 Jan 2024 19:42:34 -0800
From: John Stultz <jstultz@...gle.com>
To: Metin Kaya <metin.kaya@....com>
Cc: LKML <linux-kernel@...r.kernel.org>, Joel Fernandes <joelaf@...gle.com>,
Qais Yousef <qyousef@...gle.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Valentin Schneider <vschneid@...hat.com>, Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Zimuzo Ezeozue <zezeozue@...gle.com>,
Youssef Esmat <youssefesmat@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>, Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>,
Boqun Feng <boqun.feng@...il.com>, "Paul E. McKenney" <paulmck@...nel.org>,
Xuewen Yan <xuewen.yan94@...il.com>, K Prateek Nayak <kprateek.nayak@....com>,
Thomas Gleixner <tglx@...utronix.de>, kernel-team@...roid.com
Subject: Re: [PATCH v7 23/23] sched: Fix rt/dl load balancing via chain level balance
On Fri, Dec 22, 2023 at 6:51 AM Metin Kaya <metin.kaya@....com> wrote:
> On 20/12/2023 12:18 am, John Stultz wrote:
> > + /*
> > + * Chain leads off the rq, we're free to push it anywhere.
> > + *
> > + * One wrinkle with relying on find_exec_ctx is that when the chain
> > + * leads to a task currently migrating to rq, we see the chain as
> > + * pushable & push everything prior to the migrating task. Even if
> > + * we checked explicitly for this case, we could still race with a
> > + * migration after the check.
> > + * This shouldn't permanently produce a bad state though, as proxy()
>
> find_proxy_task()
Fixed.
> > +#ifdef CONFIG_SCHED_PROXY_EXEC
> > static inline bool dl_revalidate_rq_state(struct task_struct *task, struct rq *rq,
> > - struct rq *later)
> > + struct rq *later, bool *retry)
> > +{
> > + if (!dl_task(task) || is_migration_disabled(task))
> > + return false;
> > +
> > + if (rq != this_rq()) {
> > + struct task_struct *next_task = pick_next_pushable_dl_task(rq);
> > +
> > + if (next_task == task) {
>
> Nit: We can `return false;` if next_task != task and save one level of
> indentation.
Ah, good point. Fixed.
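i.e. that branch ends up roughly like the below (just sketching the
shape here; the respin may differ in detail):

        if (rq != this_rq()) {
                struct task_struct *next_task = pick_next_pushable_dl_task(rq);
                struct task_struct *exec_ctx;

                /* Bail early so the common path isn't nested */
                if (next_task != task)
                        return false;

                exec_ctx = find_exec_ctx(rq, next_task);
                *retry = (exec_ctx && !cpumask_test_cpu(later->cpu,
                                                        &exec_ctx->cpus_mask));
        }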
> > + struct task_struct *exec_ctx;
> > +
> > + exec_ctx = find_exec_ctx(rq, next_task);
> > + *retry = (exec_ctx && !cpumask_test_cpu(later->cpu,
> > + &exec_ctx->cpus_mask));
> > + } else {
> > + return false;
> > + }
> > + } else {
> > + int pushable = task_is_pushable(rq, task, later->cpu);
> > +
> > + *retry = pushable == -1;
> > + if (!pushable)
> > + return false;
>
> `return pushable;` can replace above 2 lines.
> The same for rt_revalidate_rq_state().
Hrm. It does save lines, but in my estimation it makes the code a touch
more complex to read. I might hold off on this for the moment unless
someone else pushes for it.
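Just to spell out what I'm weighing (assuming, as in the current patch,
that this branch falls through to the final return true):

        /* Current form: the tri-state from task_is_pushable() is handled explicitly */
        *retry = pushable == -1;
        if (!pushable)
                return false;

vs:

        /* Suggested form: shorter, but leans on -1 and 1 both converting to true */
        *retry = pushable == -1;
        return pushable;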
> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > index fabb19891e95..d5ce95dc5c09 100644
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
...
> > +#ifdef CONFIG_SCHED_PROXY_EXEC
> > +static inline bool rt_revalidate_rq_state(struct task_struct *task, struct rq *rq,
> > + struct rq *lowest, bool *retry)
>
> This function can be consolidated with dl_revalidate_rq_state() as you
> noted in the previous patch, although rt_revalidate_rq_state() has few
> comments.
Yeah. I need to stare at it a bit to try to figure out what might be
the best name to use for the common chunk.
I'd also like to figure out a better way to do the retry stuff, as it
feels messy.
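Very roughly, I'm picturing the shared piece looking something like the
below (hypothetical name and calling convention; the rt_task()/dl_task()
check and the pick_next_pushable_*_task() lookup would stay in thin
per-class wrappers, and this doesn't yet address the retry messiness):

        /*
         * Hypothetical common chunk: the class check and the per-class
         * pushable-task lookup are left to the rt/dl wrappers calling this.
         */
        static inline bool revalidate_rq_state(struct task_struct *task, struct rq *rq,
                                               struct rq *target, bool *retry,
                                               struct task_struct *next_task)
        {
                if (is_migration_disabled(task))
                        return false;

                if (rq != this_rq()) {
                        struct task_struct *exec_ctx;

                        if (next_task != task)
                                return false;

                        exec_ctx = find_exec_ctx(rq, next_task);
                        *retry = (exec_ctx && !cpumask_test_cpu(target->cpu,
                                                                &exec_ctx->cpus_mask));
                } else {
                        int pushable = task_is_pushable(rq, task, target->cpu);

                        *retry = pushable == -1;
                        if (!pushable)
                                return false;
                }
                return true;
        }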
> > +{
> > + /*
> > + * Releasing the rq lock means we need to re-check pushability.
> > + * Some scenarios:
> > + * 1) If a migration from another CPU sent a task/chain to rq
> > + * that made task newly unpushable by completing a chain
> > + * from task to rq->curr, then we need to bail out and push something
> > + * else.
> > + * 2) If our chain led off this CPU or to a dequeued task, the last waiter
> > + * on this CPU might have acquired the lock and woken (or even migrated
> > + * & run, handed off the lock it held, etc...). This can invalidate the
> > + * result of find_lowest_rq() if our chain previously ended in a blocked
> > + * task whose affinity we could ignore, but now ends in an unblocked
> > + * task that can't run on lowest_rq.
> > + * 3) Race described at https://lore.kernel.org/all/1523536384-26781-2-git-send-email-huawei.libin@huawei.com/
> > + *
> > + * Notes on these:
> > + * - Scenario #2 is properly handled by rerunning find_lowest_rq
> > + * - Scenario #1 requires that we fail
> > + * - Scenario #3 can AFAICT only occur when rq is not this_rq(). And the
> > + * suggested fix is not universally correct now that push_cpu_stop() can
> > + * call this function.
> > + */
> > + if (!rt_task(task) || is_migration_disabled(task)) {
> > + return false;
> > + } else if (rq != this_rq()) {
>
> Nit: `else` can be dropped as in dl_revalidate_rq_state().
Ack. Done.
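i.e.:

        if (!rt_task(task) || is_migration_disabled(task))
                return false;

        if (rq != this_rq()) {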
Thanks again for all the feedback!
-john