Message-ID: <20240801122115.lfxvc3dxa6b6eesl@airbuntu>
Date: Thu, 1 Aug 2024 13:21:15 +0100
From: Qais Yousef <qyousef@...alina.io>
To: Xuewen Yan <xuewen.yan94@...il.com>
Cc: Ingo Molnar <mingo@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
linux-kernel@...r.kernel.org, Lukasz Luba <lukasz.luba@....com>,
Wei Wang <wvw@...gle.com>, Rick Yiu <rickyiu@...gle.com>,
Chung-Kai Mei <chungkai@...gle.com>,
Xuewen Yan <xuewen.yan@...soc.com>,
John Stultz <jstultz@...gle.com>
Subject: Re: [PATCH 2/3] sched/fair: Generalize misfit lb by adding a misfit reason

On 07/29/24 18:47, Xuewen Yan wrote:
> Hi Qais
>
> On Thu, Jul 25, 2024 at 5:35 AM Qais Yousef <qyousef@...alina.io> wrote:
> >
> > Hi Xuewen
> >
> > On 07/17/24 16:26, Xuewen Yan wrote:
> > > Hi Qais
> > >
> > > On Sat, Dec 9, 2023 at 9:19 AM Qais Yousef <qyousef@...alina.io> wrote:
> >
> > > > @@ -11008,6 +11025,7 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > > > * average load.
> > > > */
> > > > if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> > > > + rq->misfit_reason == MISFIT_PERF &&
> > >
> > > In Android, I found this would cause a task loop to change the CPUs.
> > > Maybe this should be removed. Because for the same capacity cpus, we
> > > should skip this cpu when nr_running=1.
> >
> > Could you explain a bit more? Are you saying this is changing the behavior for
> > some use case? The check will ensure this path is only triggered for misfit
> > upmigration, which AFAICT is the only reason why this path was added.
> >
> > The problem is that to implement another misfit reason, the check for
> > capacity_greater() is not true except for MISFIT_PERF. For MISFIT_POWER, we
> > want the CPU to be smaller.
>
> Sorry, it was my mistake.

No problem, it's always good to hear back in case there's an issue :)

> After debugging, I found that there was a problem with my handling of
> MISFIT_PERF.
> But it is true that due to the influence of rt and irq load,
> capacity_greater() sometimes does cause some confusion.
> Sometimes we find that due to the different capacities between small
> cores, a misfit task will migrate several times between small cores,
> for example:
> If capacity_cpu3 > capacity_cpu2 > capacity_cpu1 > capacity_cpu0,
> the misfit task may migrate as follows: cpu0->cpu1->cpu2->cpu3.
> I don't know if this migration is really necessary, but it does cause
> me some confusion.

Migrating between little cores should be cheap in theory.

But have you verified that these migrations come from misfit load balance and
not from regular load balance trying to distribute load across the little
cores? I think it is harmless if it is caused by misfit, but yes, it looks
unnecessary to me too.

I'd love to remove the magic 5% margin in capacity_greater(), but I have no
idea how yet.
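
To make the cpu0->cpu1->cpu2->cpu3 chain concrete, here is a minimal userspace
sketch. The macro mirrors capacity_greater() as I understand it is defined in
kernel/sched/fair.c. Everything else is an assumption for illustration only:
the capacity values are made up, and next_cpu() and count_hops() are
hypothetical helpers, not how load_balance() actually picks a destination.

```c
#include <assert.h>

/* ~5% margin: is 'cap1' noticeably greater than 'cap2'? */
#define capacity_greater(cap1, cap2) ((cap1) * 1024 > (cap2) * 1078)

enum { NR_CPUS = 4 };

/*
 * Hypothetical capacities of four little CPUs after rt/irq pressure.
 * Each one is a bit more than 5% above the previous one.
 */
static const unsigned long cap[NR_CPUS] = { 300, 320, 340, 360 };

/*
 * Toy model: return the lowest-numbered CPU whose capacity clears the
 * margin over 'cur', or -1 if none does.
 */
int next_cpu(int cur)
{
	assert(cur >= 0 && cur < NR_CPUS);

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (capacity_greater(cap[cpu], cap[cur]))
			return cpu;

	return -1;
}

/* Count how many times the task moves before no CPU clears the margin. */
int count_hops(int start)
{
	int hops = 0;

	for (int cpu = next_cpu(start); cpu >= 0; cpu = next_cpu(cpu))
		hops++;

	return hops;
}
```

With these made-up numbers, a task starting on cpu0 hops three times before it
settles on cpu3, because each neighbour is just over the margin. If the
capacities were within ~5% of each other, the task would not move at all.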