Message-ID: <CAKfTPtDxqcrf0kaBQG_zpFx-DEZTMKfyxBu_bzCuZ_UZhJwOnA@mail.gmail.com>
Date: Thu, 25 Jan 2024 18:40:25 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Qais Yousef <qyousef@...alina.io>
Cc: Ingo Molnar <mingo@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Dietmar Eggemann <dietmar.eggemann@....com>, linux-kernel@...r.kernel.org,
Pierre Gondois <Pierre.Gondois@....com>
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when
updating misfit
On Wed, 24 Jan 2024 at 23:30, Qais Yousef <qyousef@...alina.io> wrote:
>
> On 01/23/24 09:26, Vincent Guittot wrote:
> > On Fri, 5 Jan 2024 at 23:20, Qais Yousef <qyousef@...alina.io> wrote:
> > >
> > > From: Qais Yousef <qais.yousef@....com>
> > >
> > > If a misfit task is affined to a subset of the possible cpus, we need to
> > > verify that one of these cpus can fit it. Otherwise the load balancer
> > > will be triggered continuously and needlessly, causing balance_interval
> > > to grow in return and eventually leading to a situation where real
> > > imbalances take a long time to address, all because of this impossible
> > > imbalance situation.
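
For reference, the check being described could look roughly like the
sketch below (a simplification for illustration, not the actual patch;
task_fits_cpu() is the existing fitness helper in kernel/sched/fair.c):

static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
{
	int cpu;

	if (!sched_asym_cpucap_active())
		return;

	/* Not misfit if it fits here (or cannot run anywhere else) */
	if (!p || p->nr_cpus_allowed == 1 || task_fits_cpu(p, cpu_of(rq))) {
		rq->misfit_task_load = 0;
		return;
	}

	/* Only flag misfit if some allowed CPU could actually fit the task */
	for_each_cpu(cpu, p->cpus_ptr) {
		if (task_fits_cpu(p, cpu)) {
			rq->misfit_task_load = max_t(unsigned long,
						     task_h_load(p), 1);
			return;
		}
	}

	rq->misfit_task_load = 0;
}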
> >
> > If your problem is about the increasing balance_interval, it would be
> > better to not increase the interval in such a case.
> > I mean that we are able to detect misfit_task conditions for the
> > periodic load balance, so we should be able to not increase the
> > interval in such cases.
> >
> > If I'm not wrong, your problem only happens when the system is
> > overutilized and we have disabled EAS
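
Something like the sketch below, at the point in load_balance() where
the interval is doubled after a failed attempt (illustrative only, not
a tested patch; it reuses the existing migrate_misfit migration_type
from kernel/sched/fair.c):

	if (!ld_moved) {
		/*
		 * Don't back off if a misfit task is still waiting for
		 * a fitting CPU: keep retrying at the current rate.
		 */
		if (env.migration_type != migrate_misfit &&
		    sd->balance_interval < sd->max_interval)
			sd->balance_interval *= 2;
	}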
>
> Yes and no. There are two concerns here:
>
> 1.
>
> So this patch is a generalized form of 0ae78eec8aa6 ("sched/eas: Don't update
> misfit status if the task is pinned"), which is when I originally noticed the
> problem; this patch was written alongside it.
>
> We have unlinked misfit from overutilized since then.
>
> And to be honest I am not sure if the flattening of the topology matters
> either, since I first noticed this on Juno, which doesn't have a flat
> topology.
>
> FWIW I can still reproduce this, but I have a different setup now. On an M1
> Mac mini, if I spawn a busy task affined to the littles and then expand the
> mask to include a single big core, I see big delays (>500ms) without the
> patch. But with the patch it moves in a few ms. The delay without the patch
> is too large and I can't explain it. So the worry here is that, in general,
> misfit migration does not happen fast enough because of these fake misfit
> cases.
I tried a similar scenario on RB5 but I don't see any difference with
your patch. That could be me not testing it correctly...
I set the affinity of an always-running task to cpu[0-3] for a few
seconds, then extend it to [0-3,7], and the time to migrate is almost
the same.
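
Roughly the following, as a self-contained userspace sketch (the CPU
numbering is an assumption: cpu[0-3] little, cpu7 big; how the migration
delay is measured, e.g. via sched_migrate_task trace events, is left out):

#define _GNU_SOURCE
#include <sched.h>
#include <time.h>

int main(void)
{
	cpu_set_t set;
	struct timespec ts;
	time_t start;

	/* Pin ourselves to the little cluster and spin for ~5s */
	CPU_ZERO(&set);
	for (int i = 0; i < 4; i++)
		CPU_SET(i, &set);
	sched_setaffinity(0, sizeof(set), &set);

	clock_gettime(CLOCK_MONOTONIC, &ts);
	start = ts.tv_sec;
	do {
		clock_gettime(CLOCK_MONOTONIC, &ts);
	} while (ts.tv_sec - start < 5);

	/* Extend the mask with one big core and keep spinning; the
	 * scheduler should now upmigrate us as a misfit task. */
	CPU_SET(7, &set);
	sched_setaffinity(0, sizeof(set), &set);

	for (;;)
		clock_gettime(CLOCK_MONOTONIC, &ts);
}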
I'm using tip/sched/core + [0]
[0] https://lore.kernel.org/all/20240108134843.429769-1-vincent.guittot@linaro.org/
>
> I did hit issues where, even with this patch, I sometimes saw big delays.
> I have no clue why this happens, so there are potentially more problems to
> chase.
>
> My expectation is that newidle balance should be able to pull a misfit task
> regardless of balance_interval. So the system has to be really busy or
> really quiet to notice delays. I think prior to the flat topology this pull
> was not guaranteed, but with the flat topology it should happen.
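
The reason it should, sketched from kernel/sched/fair.c (simplified,
details elided): the periodic path is gated on balance_interval while
the newidle path is gated on the expected idle time instead, so a
bloated interval only delays the former:

	/* Periodic path (rebalance_domains), per domain: */
	interval = get_sd_balance_interval(sd, busy);
	if (time_after_eq(jiffies, sd->last_balance + interval))
		load_balance(cpu, rq, sd, idle, &continue_balancing);

	/* Newidle path (newidle_balance), per domain: */
	if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost)
		break;
	load_balance(this_cpu, this_rq, sd, CPU_NEWLY_IDLE,
		     &continue_balancing);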
>
> On this system, if I expand the mask to all cpus (instead of littles +
> a single big), the issue is not as easy to reproduce, but I captured 35+ms
> delays - which is long if this task is carrying important work and needs to
> upmigrate. I thought newidle balance would be more likely to pull it sooner,
> but I am not 100% sure.
>
> It's a 6.6 kernel I am testing with.
>
> 2.
>
> Here, yes, the concern is that when we are overutilized and load balance is
> required, this unnecessarily long delay can cause problems.
>
>
> Cheers
>
> --
> Qais Yousef