[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201006143704.GJ4352@localhost.localdomain>
Date: Tue, 6 Oct 2020 16:37:04 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Valentin Schneider <valentin.schneider@....com>,
tglx@...utronix.de, mingo@...nel.org, linux-kernel@...r.kernel.org,
bigeasy@...utronix.de, qais.yousef@....com, swood@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, vincent.donnefort@....com, tj@...nel.org
Subject: Re: [PATCH -v2 15/17] sched: Fix migrate_disable() vs rt/dl balancing
On 06/10/20 15:48, Peter Zijlstra wrote:
> On Tue, Oct 06, 2020 at 12:20:43PM +0100, Valentin Schneider wrote:
> >
> > On 05/10/20 15:57, Peter Zijlstra wrote:
> > > In order to minimize the interference of migrate_disable() on lower
> > > priority tasks, which can be deprived of runtime due to being stuck
> > > below a higher priority task. Teach the RT/DL balancers to push away
> > > these higher priority tasks when a lower priority task gets selected
> > > to run on a freshly demoted CPU (pull).
Still digesting the whole lot, but can't we "simply" force push the
higest prio (that we preempt to make space for the migrate_disabled
lower prio) directly to the cpu that would accept the lower prio that
cannot move?
Asking because AFAIU we are calling find_lock_rq from push_cpu_stop and
that selects the best cpu for the high prio. I'm basically wondering if
we could avoid moving, potentially multiple, high prio tasks around to
make space for a lower prio task.
> > > This adds migration interference to the higher priority task, but
> > > restores bandwidth to system that would otherwise be irrevocably lost.
> > > Without this it would be possible to have all tasks on the system
> > > stuck on a single CPU, each task preempted in a migrate_disable()
> > > section with a single high priority task running.
> > >
> > > This way we can still approximate running the M highest priority tasks
> > > on the system.
> > >
> >
> > Ah, so IIUC that's the important bit that makes it we can't just say go
> > through the pushable_tasks list and skip migrate_disable() tasks.
> >
> > Once the highest-prio task exits its migrate_disable() region, your patch
> > pushes it away. If we ended up with a single busy CPU, it'll spread the
> > tasks around one migrate_enable() at a time.
> >
> > That time where the top task is migrate_disable() is still a crappy time,
> > and as you pointed out earlier today if it is a genuine pcpu task then the
> > whole thing is -EBORKED...
> >
> > An alternative I could see would be to prevent those piles from forming
> > altogether, say by issuing a similar push_cpu_stop() on migrate_disable()
> > if the next pushable task is already migrate_disable(); but that's a
> > proactive approach whereas yours is reactive, so I'm pretty sure that's
> > bound to perform worse.
>
> I think it is always possible to form pileups. Just start enough tasks
> such that newer, higher priority, tasks have to preempt existing tasks.
>
> Also, we might not be able to place the task elsewhere, suppose we have
> all our M CPUs filled with an RT task, then when the lowest priority
> task has migrate_disable(), wake the highest priority task.
>
> Per the SMP invariant, this new highest priority task must preempt the
> lowest priority task currently running, otherwise we would not be
> running the M highest prio tasks.
>
> That's not to say it might not still be beneficial from trying to avoid
> them, but we must assume a pilup will occur, therefore my focus was on
> dealing with them as best we can first.
>
Powered by blists - more mailing lists