Date:	Thu, 27 Aug 2015 13:21:16 +0200
From:	Vincent Guittot <vincent.guittot@...aro.org>
To:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
	Jason Low <jason.low2@...com>
Cc:	Linaro Kernel Mailman List <linaro-kernel@...ts.linaro.org>,
	Vincent Guittot <vincent.guittot@...aro.org>
Subject: Re: [PATCH v2] sched: fix nohz.next_balance update

Hi,

On 3 August 2015 at 11:55, Vincent Guittot <vincent.guittot@...aro.org> wrote:
> Since commit d4573c3e1c99 ("sched: Improve load balancing in the presence
> of idle CPUs"), the ILB CPU starts with the idle load balancing of other
> idle CPUs and finishes with itself in order to speed up the spread of tasks
> in all idle CPUs.
>
> this_rq->next_balance is still used in nohz_idle_balance as an
> intermediate step to gather the shortest next balance before updating
> nohz.next_balance. But the former has not been updated yet and is likely
> to still be set to the current jiffies. As a result, nohz.next_balance is
> set to the current jiffies instead of the real next balance date. This
> generates spurious kicks of the nohz idle balance.
>
> nohz_idle_balance must set nohz.next_balance without taking into
> account this_rq->next_balance, which is not updated yet. Then, this_rq will
> update nohz.next_balance with its own next_balance once updated, if necessary.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> ---
>
> changes since v1:
> - add #ifdef CONFIG_NO_HZ_COMMON for accessing nohz structure
> - fix some typos
>
>  kernel/sched/fair.c | 35 +++++++++++++++++++++++++++++++----
>  1 file changed, 31 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 587a2f6..581378a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7779,8 +7779,23 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
>          * When the cpu is attached to null domain for ex, it will not be
>          * updated.
>          */
> -       if (likely(update_next_balance))
> +       if (likely(update_next_balance)) {
>                 rq->next_balance = next_balance;
> +
> +#ifdef CONFIG_NO_HZ_COMMON
> +               /*
> +                * If this cpu has been elected to perform the nohz idle
> +                * balance, other idle cpus have already rebalanced with
> +                * nohz_idle_balance and the nohz.next_balance has been
> +                * updated accordingly. This cpu is now running the idle load
> +                * balance for itself and we need to update the
> +                * nohz.next_balance accordingly.
> +                */
> +               if ((idle == CPU_IDLE) &&
> +                       time_after(nohz.next_balance, rq->next_balance))
> +                               nohz.next_balance = rq->next_balance;
> +#endif
> +       }
>  }
>
>  #ifdef CONFIG_NO_HZ_COMMON
> @@ -7793,6 +7808,9 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>         int this_cpu = this_rq->cpu;
>         struct rq *rq;
>         int balance_cpu;
> +       /* Earliest time when we have to do rebalance again */
> +       unsigned long next_balance = jiffies + 60*HZ;
> +       int update_next_balance = 0;
>
>         if (idle != CPU_IDLE ||
>             !test_bit(NOHZ_BALANCE_KICK, nohz_flags(this_cpu)))
> @@ -7824,10 +7842,19 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>                         rebalance_domains(rq, CPU_IDLE);
>                 }
>
> -               if (time_after(this_rq->next_balance, rq->next_balance))
> -                       this_rq->next_balance = rq->next_balance;
> +               if (time_after(next_balance, rq->next_balance)) {
> +                       next_balance = rq->next_balance;
> +                       update_next_balance = 1;
> +               }
>         }
> -       nohz.next_balance = this_rq->next_balance;
> +
> +       /*
> +        * next_balance will be updated only when there is a need.
> +        * When the cpu is attached to null domain for ex, it will not be
> +        * updated.
> +        */
> +       if (likely(update_next_balance))
> +               nohz.next_balance = next_balance;
>  end:
>         clear_bit(NOHZ_BALANCE_KICK, nohz_flags(this_cpu));
>  }
> --
> 1.9.1
>

Gentle ping
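
In case it helps review, here is a minimal userspace sketch of what the
patch changes. It is not kernel code: the function names nohz_next_old,
nohz_next_new and fold_in_own are made up for illustration, HZ is an
assumed value, and time_after is a simplified copy of the kernel macro
without its type checking. It only models how the earliest next balance
is gathered before and after the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Wraparound-safe "a is later than b", modelled on the kernel's time_after(). */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/* Assumed tick rate, for illustration only. */
#define HZ 250UL

/*
 * Old behaviour: the minimum is gathered in this_rq->next_balance, which
 * has not been refreshed yet and typically still equals the current jiffies.
 */
static unsigned long nohz_next_old(unsigned long this_rq_next,
				   const unsigned long *idle_next, size_t n)
{
	assert(idle_next != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (time_after(this_rq_next, idle_next[i]))
			this_rq_next = idle_next[i];

	return this_rq_next;
}

/*
 * Patched behaviour: start from a date far in the future and only
 * overwrite nohz.next_balance if at least one idle rq supplied a value.
 */
static unsigned long nohz_next_new(unsigned long now, unsigned long nohz_next,
				   const unsigned long *idle_next, size_t n)
{
	unsigned long next_balance = now + 60 * HZ;
	int update_next_balance = 0;

	assert(idle_next != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (time_after(next_balance, idle_next[i])) {
			next_balance = idle_next[i];
			update_next_balance = 1;
		}
	}

	return update_next_balance ? next_balance : nohz_next;
}

/*
 * Second step of the patch: once the ILB cpu has rebalanced itself, its
 * own next_balance is folded in if it is earlier than nohz.next_balance.
 */
static unsigned long fold_in_own(unsigned long nohz_next,
				 unsigned long own_next)
{
	return time_after(nohz_next, own_next) ? own_next : nohz_next;
}
```

With now = 1000, a stale this_rq->next_balance of 1000 and two idle rqs
due at 1040 and 1064, nohz_next_old() returns 1000, so the next tick kicks
the ILB again right away. nohz_next_new() returns 1040, and fold_in_own()
lowers that further only if the ILB cpu's own next balance is earlier.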

Regards,