Message-ID: <CAKfTPtBo7Kny0r15ik07pMLCjETw7UQo=ypbXww22fMLHzQkgA@mail.gmail.com>
Date: Tue, 15 Oct 2024 14:44:15 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Pierre Gondois <pierre.gondois@....com>
Cc: linux-kernel@...r.kernel.org, Hongyan Xia <hongyan.xia2@....com>, 
	Christian Loehle <christian.loehle@....com>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, 
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Valentin Schneider <vschneid@...hat.com>
Subject: Re: [PATCH 1/1] sched/fair: Update blocked averages on tick

On Fri, 11 Oct 2024 at 14:32, Pierre Gondois <pierre.gondois@....com> wrote:
>
> The Energy Aware Scheduler (EAS) relies on CPU/task utilization
> signals. On an idle CPU, the blocked load is updated during
> load balancing.
>
> sd->balance_interval increases with the number of CPUs in the domain.
> On an Arm DynamIQ system, there is no sched domain containing only
> CPUs of the same capacity. On a Pixel6 with 4 little, 2 mid, 2 big CPUs:
> - sd->min_interval = 8
> - sd->max_interval = 16
>
> The balance interval is doubled if the system is balanced, meaning
> that a balanced system will likely update blocked load every 16ms.
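
For context, the doubling referred to above is roughly this logic at the
end of sched_balance_rq() when no imbalance is found (paraphrased from
mainline, exact context may differ):

	/* back off the balancing interval, up to sd->max_interval */
	if (sd->balance_interval < sd->max_interval)
		sd->balance_interval *= 2;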

The real max boundary is LOAD_AVG_PERIOD, which is used to update
nohz.next_blocked. That is the maximum time between 2 updates of the
blocked load. The other ones are opportunistic updates done when a
normal load balance is triggered.
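
For reference, that boundary comes from the way nohz.next_blocked is
rearmed, something along these lines in the mainline nohz idle balance
path (sketch, not an exact quote; LOAD_AVG_PERIOD is 32, so ~32 ms):

	/* next mandatory update of the blocked load of idle CPUs */
	WRITE_ONCE(nohz.next_blocked,
		   jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD));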

>
> The find_energy_efficient_cpu() function might thus rely on outdated
> util signals to place tasks, leading to bad energy placement.

Is moving from 8 ms to 16 ms what makes the difference for you?

The LOAD_AVG_PERIOD max period has been used as a default value, but if
it's too long, we could consider changing the max period between 2
updates.
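
Something purely illustrative of what I mean (the constant name below is
made up, not an actual proposal):

	/* hypothetical: use a shorter max period than LOAD_AVG_PERIOD (32 ms) */
	#define BLOCKED_LOAD_MAX_PERIOD_MS	16

	WRITE_ONCE(nohz.next_blocked,
		   jiffies + msecs_to_jiffies(BLOCKED_LOAD_MAX_PERIOD_MS));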

>
> Update blocked load on sched tick if:
> - the rq is idle
> - the load balancer will not be triggered.
>
> Signed-off-by: Pierre Gondois <pierre.gondois@....com>
> ---
>  kernel/sched/fair.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 225b31aaee55..2f03bd10ac7a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9841,15 +9841,12 @@ static unsigned long task_h_load(struct task_struct *p)
>  }
>  #endif
>
> -static void sched_balance_update_blocked_averages(int cpu)
> +static void update_blocked_averages(struct rq *rq)
>  {
>         bool decayed = false, done = true;
> -       struct rq *rq = cpu_rq(cpu);
> -       struct rq_flags rf;
>
> -       rq_lock_irqsave(rq, &rf);
> -       update_blocked_load_tick(rq);
>         update_rq_clock(rq);
> +       update_blocked_load_tick(rq);
>
>         decayed |= __update_blocked_others(rq, &done);
>         decayed |= __update_blocked_fair(rq, &done);
> @@ -9857,6 +9854,18 @@ static void sched_balance_update_blocked_averages(int cpu)
>         update_blocked_load_status(rq, !done);
>         if (decayed)
>                 cpufreq_update_util(rq, 0);
> +}
> +
> +static void sched_balance_update_blocked_averages(int cpu)
> +{
> +       struct rq *rq = cpu_rq(cpu);
> +       struct cfs_rq *cfs_rq;
> +       struct rq_flags rf;
> +
> +       cfs_rq = &rq->cfs;
> +
> +       rq_lock_irqsave(rq, &rf);
> +       update_blocked_averages(rq);
>         rq_unlock_irqrestore(rq, &rf);
>  }
>
> @@ -12877,6 +12886,8 @@ void sched_balance_trigger(struct rq *rq)
>
>         if (time_after_eq(jiffies, rq->next_balance))
>                 raise_softirq(SCHED_SOFTIRQ);
> +       else if (idle_cpu(rq->cpu))
> +               update_blocked_averages(rq);

It would be good to explain why you don't need the rq lock here.

There is no rate limit, so we can end up doing this every tick (possibly
every 1 ms) when staying in the shallowest idle state.
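
If this path were kept, it would at least want some rate limiting, e.g.
something along these lines reusing rq->last_blocked_load_update_tick,
which update_blocked_load_tick() already maintains (untested sketch):

	else if (idle_cpu(rq->cpu) &&
		 time_after(jiffies,
			    READ_ONCE(rq->last_blocked_load_update_tick) +
			    msecs_to_jiffies(LOAD_AVG_PERIOD)))
		update_blocked_averages(rq);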

So it looks better to update the period between 2 updates of the blocked
load instead of adding a new path.

>
>         nohz_balancer_kick(rq);
>  }
> --
> 2.25.1
>
