linux-kernel - Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKfTPtC0=MH7bCypeY1QFxt=pFbPxY9YLuuS8_dhkF31nR6ZWQ@mail.gmail.com>
Date: Tue, 30 Jan 2024 10:41:50 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Qais Yousef <qyousef@...alina.io>
Cc: Ingo Molnar <mingo@...nel.org>, Peter Zijlstra <peterz@...radead.org>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, linux-kernel@...r.kernel.org, 
	Pierre Gondois <Pierre.Gondois@....com>
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when
 updating misfit

On Mon, 29 Jan 2024 at 00:50, Qais Yousef <qyousef@...alina.io> wrote:
>
> On 01/26/24 15:08, Vincent Guittot wrote:
>
> > > TBH I had a bit of confirmation bias that this is a problem based on the fix
> > > (0ae78eec8aa6) that we had in the past. So on verification I looked at
> > > balance_interval and this reproducer which is a not the same as the original
> > > one and it might be exposing another problem and I didn't think twice about it.
> >
> > I checked the behavior more deeply and I confirm that I don't see
> > improvement for the use case described above. I would say that it's
> > even worse as I can see some runs where the task stays on little
> > whereas a big core has been added in the affinity. Having in mind that
> > my system is pretty idle which means that there is almost no other
> > reason to trigger an ilb than the misfit task, the change in
> > check_misfit_status() is probably the reason for never kicking an ilb
> > for such case
>
> It seems I reproduced another problem while trying to reproduce the original
> issue, eh.
>
> I did dig more and from what I see the issue is that the rd->overload is not
> being set correctly. Which I believe what causes the delays (see attached
> picture how rd.overloaded is 0 with some spikes). Only when CPU7
> newidle_balance() coincided with rd->overload being 1 that the migration
> happens. With the below hack I can see that rd->overload is 1 all the time

But here you rely on another activity happening in CPU7 whereas the
misfit should trigger by itself the load balance and not expect
another task waking up then sleeping on cpu7 to trigger a newidle
balance. We want a normal idle load balance not a newidle_balance

> (even after the move as we still trigger a misfit on the big CPU). With my
> patch only rd->overload is set to 1 (because of this task) only for a short
> period after we change affinity.
>
>         diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>         index df348aa55d3c..86069fe527f9 100644
>         --- a/kernel/sched/fair.c
>         +++ b/kernel/sched/fair.c
>         @@ -9707,8 +9707,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>                                 continue;
>                         }
>
>         -               if (local_group)
>         -                       continue;
>         +               /* if (local_group) */
>         +                       /* continue; */
>
>                         if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
>                                 /* Check for a misfit task on the cpu */
>
> I am not sure what the right fix is, but it seems this condition is required
> for the 2nd leg of this if condition when we compare with load? I don't think
> we should skip the misfit check.
>
>
> Thanks
>
> --
> Qais Yousef