Message-ID: <CAKfTPtA-cipUfjxHnDhWSoPT3sAMjW+9tgBpx7Fe-0XYA=tHsg@mail.gmail.com>
Date: Tue, 13 Jan 2026 11:21:55 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Shrikanth Hegde <sshegde@...ux.ibm.com>
Cc: mingo@...nel.org, peterz@...radead.org, linux-kernel@...r.kernel.org, 
	kprateek.nayak@....com, juri.lelli@...hat.com, vschneid@...hat.com, 
	tglx@...nel.org, dietmar.eggemann@....com, anna-maria@...utronix.de, 
	frederic@...nel.org, wangyang.guo@...el.com
Subject: Re: [PATCH v4 1/3] sched/fair: Move checking for nohz cpus after time check

On Tue, 13 Jan 2026 at 10:23, Shrikanth Hegde <sshegde@...ux.ibm.com> wrote:
>
>
>
> On 1/13/26 2:37 PM, Vincent Guittot wrote:
> > On Mon, 12 Jan 2026 at 06:05, Shrikanth Hegde <sshegde@...ux.ibm.com> wrote:
> >>
> >> NOHZ idle load balancer is kicked off only after time check. So move
> >> the atomic read after the time check to access it only when needed.
> >>
> >> When there are no idle CPUs(100% busy), even if the flag gets set to
> >> NOHZ_STATS_KICK | NOHZ_NEXT_KICK, find_new_ilb will fail and
> >> there will be no NOHZ idle balance. The current behaviour is retained.
> >>
> >> Note: This patch doesn't solve any cacheline overheads. No improvement
> >> in performance apart from saving a few cycles of atomic_read.
> >
> > But won't these cycles then be wasted by needlessly calling kick_ilb?
> >
>
> When there are nohz cpus, i.e. nohz.nr_cpus > 0, there is no change in code flow.
>
> Only when the system is 100% busy (which is expected to be rare), nohz.nr_cpus == 0,
> and then it is expected that has_blocked_load = 0. So flags shouldn't be set.

The way we set and clear has_blocked_load versus nr_cpus/idle_cpus_mask
means it is possible to end up with has_blocked_load == 1 while
nr_cpus == 0, although this is a corner case and not the default
behavior.

No CPUs are idle: nr_cpus == 0

CPU0 enters idle
  - inc nr_cpus and set idle_cpus_mask
  - set nohz.has_blocked_load

CPU0 wakes up

Tick fires on CPU0
  - dec nr_cpus and clear idle_cpus_mask
  - nohz.has_blocked_load == 1 and, most probably, now > nohz.next_blocked,
    so flags = NOHZ_STATS_KICK
  - if now < nohz.next_balance, we skip the nr_cpus test and call
    kick_ilb(), although nr_cpus == 0 and idle_cpus_mask is empty
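
The sequence above can be replayed in a small userspace model. This is only
a sketch under assumptions, not kernel code: struct nohz_sketch, the
sketch_* helpers and SKETCH_STATS_KICK are hypothetical names, the "out"
path is reduced to "kick if any flag is set", and everything in
nohz_balancer_kick() after the nr_cpus test is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for NOHZ_STATS_KICK; not the kernel's value. */
enum { SKETCH_STATS_KICK = 1u << 0 };

/* Simplified model of the nohz state discussed in this thread. */
struct nohz_sketch {
	int nr_cpus;                /* CPUs currently in tickless idle */
	bool has_blocked_load;      /* blocked load may need updating */
	unsigned long next_blocked; /* next blocked-load update time */
	unsigned long next_balance; /* next idle load balance time */
};

struct kick_result {
	bool kicked;                /* would kick_ilb() have been called? */
	unsigned int flags;
};

/* Wrap-safe "a is after b", same idea as the kernel's time_after(). */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* CPU enters idle: inc nr_cpus, set has_blocked_load. */
static void sketch_enter_idle(struct nohz_sketch *n)
{
	n->nr_cpus++;
	n->has_blocked_load = true;
}

/* Tick after wakeup: dec nr_cpus; has_blocked_load is left untouched. */
static void sketch_exit_idle(struct nohz_sketch *n)
{
	assert(n->nr_cpus > 0);
	n->nr_cpus--;
}

/* Order before the patch: the nr_cpus test comes first. */
static struct kick_result
sketch_kick_original(const struct nohz_sketch *n, unsigned long now)
{
	struct kick_result r = { false, 0 };

	if (n->nr_cpus == 0)
		return r;
	if (n->has_blocked_load && sketch_time_after(now, n->next_blocked))
		r.flags = SKETCH_STATS_KICK;
	/* Later checks are omitted; both paths end at "out" in this model. */
	if (r.flags)
		r.kicked = true;
	return r;
}

/* Order after the patch: the nr_cpus test comes after the time check. */
static struct kick_result
sketch_kick_patched(const struct nohz_sketch *n, unsigned long now)
{
	struct kick_result r = { false, 0 };

	if (n->has_blocked_load && sketch_time_after(now, n->next_blocked))
		r.flags = SKETCH_STATS_KICK;
	if (sketch_time_after(n->next_balance, now)) /* now < next_balance */
		goto out;
	if (n->nr_cpus == 0)
		return r;
	/* Later checks are omitted in this model. */
out:
	if (r.flags)
		r.kicked = true;
	return r;
}
```

With next_blocked = 100 and next_balance = 200, one idle entry followed by
one exit leaves nr_cpus == 0 and has_blocked_load set. At now = 150 the
patched order reports a kick while the original order does not; at
now = 250 the patched order reaches the nr_cpus test and returns without
kicking.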

> Note we are still doing a return if nohz.nr_cpus == 0. So kick_ilb shouldn't be
> called.

That return is skipped whenever the earlier check,
if (time_before(now, nohz.next_balance)) goto out, is taken.

>
> Do you see any path still calling kick_ilb unnecessarily?

Yes, but at the same time it's clearly not the main case.


>
>
> >>
> >> Signed-off-by: Shrikanth Hegde <sshegde@...ux.ibm.com>
> >> ---
> >>   kernel/sched/fair.c | 18 +++++++++++-------
> >>   1 file changed, 11 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 9743fc0b225c..17e4e8ac5fca 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -12451,20 +12451,24 @@ static void nohz_balancer_kick(struct rq *rq)
> >>           */
> >>          nohz_balance_exit_idle(rq);
> >>
> >> -       /*
> >> -        * None are in tickless mode and hence no need for NOHZ idle load
> >> -        * balancing:
> >> -        */
> >> -       if (likely(!atomic_read(&nohz.nr_cpus)))
> >> -               return;
> >> -
> >>          if (READ_ONCE(nohz.has_blocked_load) &&
> >>              time_after(now, READ_ONCE(nohz.next_blocked)))
> >>                  flags = NOHZ_STATS_KICK;
> >>
> >> +       /*
> >> +        * If none are in tickless mode, though flag maybe set,
> >> +        * idle load balancing is not done as find_new_ilb fails
> >> +        */
> >>          if (time_before(now, nohz.next_balance))
> >>                  goto out;
> >>
> >> +       /*
> >> +        * None are in tickless mode and hence no need for NOHZ idle load
> >> +        * balancing:
> >> +        */
> >> +       if (likely(!atomic_read(&nohz.nr_cpus)))
> >> +               return;
> >> +
> >>          if (rq->nr_running >= 2) {
> >>                  flags = NOHZ_STATS_KICK | NOHZ_BALANCE_KICK;
> >>                  goto out;
> >> --
> >> 2.47.3
> >>
>
