Message-ID: <20240201222129.myunpz3vb425i3dj@airbuntu>
Date: Thu, 1 Feb 2024 22:21:29 +0000
From: Qais Yousef <qyousef@...alina.io>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Ingo Molnar <mingo@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
linux-kernel@...r.kernel.org,
Pierre Gondois <Pierre.Gondois@....com>
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when
updating misfit
On 01/31/24 14:55, Vincent Guittot wrote:
> On Wed, 31 Jan 2024 at 00:57, Qais Yousef <qyousef@...alina.io> wrote:
> >
> > On 01/30/24 10:41, Vincent Guittot wrote:
> > > On Mon, 29 Jan 2024 at 00:50, Qais Yousef <qyousef@...alina.io> wrote:
> > > >
> > > > On 01/26/24 15:08, Vincent Guittot wrote:
> > > >
> > > > > > TBH I had a bit of confirmation bias that this is a problem based on the fix
> > > > > > (0ae78eec8aa6) that we had in the past. So on verification I looked at
> > > > > > balance_interval, and this reproducer, which is not the same as the original
> > > > > > one, might be exposing another problem; I didn't think twice about it.
> > > > >
> > > > > I checked the behavior more deeply and I confirm that I don't see
> > > > > an improvement for the use case described above. I would say that it's
> > > > > even worse, as I can see some runs where the task stays on a little
> > > > > core whereas a big core has been added to the affinity. Bearing in mind
> > > > > that my system is pretty idle, which means that there is almost no other
> > > > > reason to trigger an ilb than the misfit task, the change in
> > > > > check_misfit_status() is probably the reason for never kicking an ilb
> > > > > for such a case.
> > > >
> > > > It seems I reproduced another problem while trying to reproduce the original
> > > > issue, eh.
> > > >
> > > > I did dig more, and from what I see the issue is that rd->overload is not
> > > > being set correctly, which I believe is what causes the delays (see the
> > > > attached picture of how rd.overloaded is 0 with some spikes). Only when CPU7's
> > > > newidle_balance() coincided with rd->overload being 1 did the migration
> > > > happen. With the below hack I can see that rd->overload is 1 all the time.
> > >
> > > But here you rely on another activity happening on CPU7, whereas the
> >
> > I don't want to rely on that. I think this is a problem too. And this is what
> > ends up happening from what I see, sometimes at least.
> >
> > When is newidle_balance() expected to pull anyway? I agree we shouldn't
> > rely on it happening at random, but if it happens sooner, it should pull, no?
> >
> > > misfit should trigger the load balance by itself and not expect
> > > another task waking up and then sleeping on cpu7 to trigger a newidle
> > > balance. We want a normal idle load balance, not a newidle_balance
> >
> > I think there's a terminology problem. I thought you meant newidle_balance()
> > by ilb. It seems you're referring to load_balance() called from
> > rebalance_domains() when the tick happens at idle?
>
> newidle_balance is different from idle load balance. newidle_balance
> happens when the cpu becomes idle, whereas busy and idle load balance
> happen at the tick.
Yes. newidle_balance() is not supposed to pull a misfit task then?
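
To make sure I have the picture right: newidle_balance() runs synchronously
when a CPU is about to go idle, while the misfit case is supposed to be caught
at the tick by nohz_balancer_kick() via check_misfit_status(). For reference,
this is roughly how the pre-series gate looks in my 6.6-ish tree (paraphrased,
so the details may differ in yours):

	static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
	{
		/*
		 * Kick an ilb only if this rq runs a misfit task AND this
		 * CPU is not the biggest around (or is capacity-pressured).
		 */
		return rq->misfit_task_load &&
			(rq->cpu_capacity_orig < rq->rd->max_cpu_capacity ||
			 check_cpu_capacity(rq, sd));
	}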
>
> >
> > I thought this was not kicking. But I just double-checked my traces, and I was
> > getting confused because I was looking at where run_rebalance_domains() would
> > happen, for example on CPU2, while the balance would actually be for CPU7.
>
> An idle load balance happens either on the target CPU, if its tick is
> not stopped, or we kick one idle CPU to run the idle load balance on
> behalf of all idle CPUs. It is the latter case that no longer happens
> with your patch and the change in check_misfit_status().
Yes. I just got confused while looking at the log. FWIW, I'm testing without my
patch. The kernel should be 6.6 from the Asahi folks, which should contain
nothing but the not-yet-fully-upstreamed bits necessary to make the machine run.
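
And to double-check I understand the kick itself: my mental model (a
paraphrased sketch of kick_ilb() in fair.c, not exact code) is that the tick
path picks one idle CPU and IPIs it to run the balance on behalf of everyone:

	static void kick_ilb(unsigned int flags)
	{
		int ilb_cpu = find_new_ilb();	/* pick one idle CPU */

		if (ilb_cpu >= nr_cpu_ids)
			return;			/* no idle CPU around */

		/* Record why we kick; bail if a kick is already pending. */
		flags = atomic_fetch_or(flags, nohz_flags(ilb_cpu));
		if (flags & NOHZ_KICK_MASK)
			return;

		/* IPI the idle CPU to run the nohz idle balance. */
		smp_call_function_single_async(ilb_cpu, &cpu_rq(ilb_cpu)->nohz_csd);
	}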
>
> >
> > No clue why it fails to pull. I can actually see that we call load_balance()
> > twice for some (not all) entries into rebalance_domains(), so we don't always
> > operate on both domains. But that's not necessarily a problem.
>
> We have 3 different reasons for kicking an idle load balance:
> - to do an actual balance of tasks
> - to update stats, i.e. blocked load
> - to update nohz.next_balance
>
> You are interested in the 1st one, but it's most probably for the last
> 2 reasons that this happens
Okay, thanks for the info. I need to figure out why the 1st one fails even
though there's a misfit task to pull.
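
One thing I can do is log which of the three reasons we actually kick with.
Something like this debug hack (untested; the NOHZ_*_KICK flag names are from
kernel/sched/sched.h in my tree), dropped into kick_ilb():

	/* Untested debug hack: log which reason(s) this kick carries. */
	trace_printk("kick_ilb cpu%d flags:%s%s%s\n", ilb_cpu,
		     flags & NOHZ_BALANCE_KICK ? " BALANCE" : "",
		     flags & NOHZ_STATS_KICK   ? " STATS"   : "",
		     flags & NOHZ_NEXT_KICK    ? " NEXT"    : "");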
>
> >
> > I think it's a good opportunity to add some tracepoints to help break this
> > path down. If you have suggestions for things to record, that'd be helpful.
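
For example, something along these lines; it's just a sketch of the kind of
thing I mean, with a made-up name, following the style of the existing bare
pelt_*_tp tracepoints in include/trace/events/sched.h:

	/*
	 * Hypothetical: fire when we (try to) kick an ilb, recording the
	 * target CPU and the reason flags, so the path can be broken down.
	 */
	DECLARE_TRACE(sched_kick_ilb_tp,
		TP_PROTO(int ilb_cpu, unsigned int flags),
		TP_ARGS(ilb_cpu, flags));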