Message-ID: <20240124222959.ikwnbxkcjaxuiqp2@airbuntu>
Date: Wed, 24 Jan 2024 22:29:59 +0000
From: Qais Yousef <qyousef@...alina.io>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Ingo Molnar <mingo@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
linux-kernel@...r.kernel.org,
Pierre Gondois <Pierre.Gondois@....com>
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when
updating misfit
On 01/23/24 09:26, Vincent Guittot wrote:
> On Fri, 5 Jan 2024 at 23:20, Qais Yousef <qyousef@...alina.io> wrote:
> >
> > From: Qais Yousef <qais.yousef@....com>
> >
> > If a misfit task is affined to a subset of the possible cpus, we need to
> > verify that one of these cpus can fit it. Otherwise the load balancer
> > code will continuously trigger needlessly, leading the balance_interval
> > to increase in return, and we eventually end up with a situation where
> > real imbalances take a long time to address because of this impossible
> > imbalance situation.
>
> If your problem is about increasing balance_interval, it would be
> better to not increase the interval in such a case.
> I mean that we are able to detect misfit_task conditions for the
> periodic load balance so we should be able to not increase the
> interval in such cases.
>
> If I'm not wrong, your problem only happens when the system is
> overutilized and we have disabled EAS
Yes and no. There are two concerns here:
1.
So this patch is a generalized form of 0ae78eec8aa6 ("sched/eas: Don't update
misfit status if the task is pinned"), which is when I originally noticed the
problem; this patch was written alongside it.
We have unlinked misfit from overutilized since then.
And to be honest, I am not sure if the flattening of the topology matters
either, since I first noticed this on Juno, which doesn't have a flat topology.
FWIW I can still reproduce this, but I have a different setup now. On an M1 Mac
mini, if I spawn a busy task affined to the littles and then expand the mask to
include a single big core, I see big delays (>500ms) without the patch. But
with the patch it moves within a few ms. The delay without the patch is too
large and I can't explain it. So the worry here is that, generally, misfit
migration is not happening fast enough due to these fake misfit cases.
I did hit issues where, even with this patch, I still saw big delays
sometimes. I have no clue why this happens, so there are potentially more
problems to chase.
My expectation is that newidle balance should be able to pull a misfit task
regardless of balance_interval, so the system has to be really busy or really
quiet to notice delays. I think prior to the flat topology this pull was not
guaranteed, but with the flat topology it should happen.
On this system, if I expand the mask to all cpus (instead of littles +
a single big), the issue is not as easy to reproduce, but I captured 35+ms
delays - which is long if this task was carrying important work and needed to
upmigrate. I thought newidle balance would be more likely to pull it sooner,
but I am not 100% sure.
It's a 6.6 kernel I am testing with.
2.
Here, yes, the concern is that when we are overutilized and load balance is
required, this unnecessarily long delay can cause potential problems.
Cheers
--
Qais Yousef