Message-ID: <20240801122115.lfxvc3dxa6b6eesl@airbuntu>
Date: Thu, 1 Aug 2024 13:21:15 +0100
From: Qais Yousef <qyousef@...alina.io>
To: Xuewen Yan <xuewen.yan94@...il.com>
Cc: Ingo Molnar <mingo@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	linux-kernel@...r.kernel.org, Lukasz Luba <lukasz.luba@....com>,
	Wei Wang <wvw@...gle.com>, Rick Yiu <rickyiu@...gle.com>,
	Chung-Kai Mei <chungkai@...gle.com>,
	Xuewen Yan <xuewen.yan@...soc.com>,
	John Stultz <jstultz@...gle.com>
Subject: Re: [PATCH 2/3] sched/fair: Generalize misfit lb by adding a misfit
 reason

On 07/29/24 18:47, Xuewen Yan wrote:
> Hi Qais
> 
> On Thu, Jul 25, 2024 at 5:35 AM Qais Yousef <qyousef@...alina.io> wrote:
> >
> > Hi Xuewen
> >
> > On 07/17/24 16:26, Xuewen Yan wrote:
> > > Hi Qais
> > >
> > > On Sat, Dec 9, 2023 at 9:19 AM Qais Yousef <qyousef@...alina.io> wrote:
> >
> > > > @@ -11008,6 +11025,7 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > > >                  * average load.
> > > >                  */
> > > >                 if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> > > > +                   rq->misfit_reason == MISFIT_PERF &&
> > >
> > > In Android, I found this would cause a task loop to change the CPUs.
> > > Maybe this should be removed. Because for the same capacity cpus, we
> > > should skip this cpu when nr_running=1.
> >
> > Could you explain a bit more? Are you saying this is changing the behavior for
> > some use case? The check will ensure this path is only triggered for misfit
> > upmigration, which AFAICT is the only reason why this path was added.
> >
> > The problem is that to implement another misfit reason, the check for
> > capacity_greater() is not true except for MISFIT_PERF. For MISFIT_POWER, we
> > want the CPU to be smaller.
> 
> Sorry, it was my mistake.

Np, it's always good to hear back in case there's a problem :)

> After debugging, I found that there was a problem with my handling of
> MISFIT_PERF.
> But it is true that due to the influence of rt and irq load,
> capacity_greater() sometimes does cause some confusion.
> Sometimes we find that due to the different capacities between small
> cores, a misfit task will migrate several times between small cores,
> for example:
> If capacity_cpu3 > capacity_cpu2 > capacity_cpu1 > capacity_cpu0,
> the misfit task may migrate as follows: cpu0->cpu1->cpu2->cpu3.
> I don't know if this migration is really necessary, but it does cause
> me some confusion.

It should be cheap in theory.

But have you verified that the migration type is misfit, and not regular load
balance trying to distribute load across the little cores? I think it is
harmless if it is caused by misfit, but yes, it looks unnecessary to me too.

I'd love to remove this 5% magic margin, but I have no idea how yet.
