Date:   Thu, 22 Feb 2018 10:04:56 +0000
From:   Valentin Schneider <valentin.schneider@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Morten Rasmussen <morten.rasmussen@...s.arm.com>,
        Brendan Jackman <brendan.jackman@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>
Subject: Re: [PATCH v5 1/3] sched: Stop nohz stats when decayed

On 02/22/2018 08:37 AM, Vincent Guittot wrote:
> On 21 February 2018 at 14:13, Valentin Schneider
> <valentin.schneider@....com> wrote:
>> On 02/16/2018 01:44 PM, Vincent Guittot wrote:
>>> On 16 February 2018 at 13:13, Valentin Schneider
>>> <valentin.schneider@....com> wrote:
>>>> On 02/14/2018 03:26 PM, Vincent Guittot wrote:
>>>>> Stop the periodic update of blocked load when all idle CPUs have fully
>>>>> decayed. We introduce a new nohz.has_blocked that reflects whether some idle
>>>>> CPUs have blocked load that has to be periodically updated. nohz.has_blocked
>>>>> is set every time an idle CPU can have blocked load, and it is then cleared
>>>>> when no more blocked load has been detected during an update. We don't need
>>>>> atomic operations, only to make sure of the right ordering when updating
>>>>> nohz.idle_cpus_mask and nohz.has_blocked.
>>>>>
>>>>> Suggested-by: Peter Zijlstra (Intel) <peterz@...radead.org>
>>>>> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
>>>>> ---
>>>>>  kernel/sched/fair.c  | 122 ++++++++++++++++++++++++++++++++++++++++++---------
>>>>>  kernel/sched/sched.h |   1 +
>>>>>  2 files changed, 102 insertions(+), 21 deletions(-)
>>>>>
>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>> index 7af1fa9..5a6835e 100644
>>>>> --- a/kernel/sched/fair.c
>>>>> +++ b/kernel/sched/fair.c
>>>>>
>>>>> [...]
>>
>> I have one more question on that bit:
>>
>>
>>                 has_blocked_load |= update_nohz_stats(rq, true);
>>
>>                 /*
>>                  * If time for next balance is due,
>>                  * do the balance.
>>                  */
>>                 if (time_after_eq(jiffies, rq->next_balance)) {
>>                         struct rq_flags rf;
>>
>>                         rq_lock_irqsave(rq, &rf);
>>                         update_rq_clock(rq);
>>                         cpu_load_update_idle(rq);
>>                         rq_unlock_irqrestore(rq, &rf);
>>
>>                         if (flags & NOHZ_BALANCE_KICK)
>>                                 rebalance_domains(rq, CPU_IDLE);
>>                 }
>>
>>                 if (time_after(next_balance, rq->next_balance)) {
>>                         next_balance = rq->next_balance;
>>                         update_next_balance = 1;
>>                 }
>>
>>
>> Now that I think about it, shouldn't we always have a 'continue' after
>> the blocked load update if (flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK ?
>> AFAICT we don't want to push the next_balance forward, only the next_blocked.
> 
> But we don't push next_balance forward. It just gets the shortest
> next_balance and updates nohz.next_balance, exactly like what is done in
> a full idle load balance
> 

Sorry, that was a poor choice of words - I probably should've just gone with
"update". What I meant by that is that if we have
    (flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK
then we're not going to do the load balance.

Then, in this case, I thought that we should not be going through any
condition that uses nohz.next_balance (since we're not doing any balancing).
Arguably *updating* nohz.next_balance still makes sense in this scenario.
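
For reference, the "pick the shortest next_balance" step can be sketched in
userspace as below. This is only an illustration under assumptions:
tick_after() is a local re-implementation in the style of the kernel's
time_after(), and earliest_balance() is a hypothetical helper name, not
something from the patch.

```c
#include <stdbool.h>

/*
 * Wrap-safe "a is after b" for unsigned long tick counters.
 * Local stand-in modelled on the kernel's time_after(); like the kernel
 * macro, it relies on the usual two's-complement conversion to long.
 */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Hypothetical helper: starting from an initial deadline, return the
 * earliest of the per-runqueue deadlines in rq_next[0..n-1].
 * This mirrors the "if (time_after(next_balance, rq->next_balance))"
 * test in the quoted loop.
 */
static unsigned long earliest_balance(unsigned long initial,
				      const unsigned long *rq_next, int n)
{
	unsigned long next = initial;

	for (int i = 0; i < n; i++) {
		if (tick_after(next, rq_next[i]))
			next = rq_next[i];
	}
	return next;
}
```

The point of the signed subtraction is that the comparison keeps working
when the tick counter wraps around, which a plain "a > b" would not.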

In short, my comment was mostly about "cleanly" separating stats update vs
load balance.
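
A rough userspace sketch of the separation I have in mind follows. It is
not the patch itself: struct fake_rq, stats_only() and walk_idle_rqs() are
made-up names, the counters merely stand in for update_nohz_stats() and
rebalance_domains(), and the flag values are assumed to match the
definitions in kernel/sched/sched.h.

```c
#include <stdbool.h>

/* Assumed to mirror the NOHZ_* kick flags in kernel/sched/sched.h. */
#define NOHZ_BALANCE_KICK 0x1u
#define NOHZ_STATS_KICK   0x2u
#define NOHZ_KICK_MASK    (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)

/* Hypothetical stand-in for struct rq, reduced to what the sketch needs. */
struct fake_rq {
	bool has_blocked_load;
	int stats_updates;	/* counts simulated update_nohz_stats() calls */
	int balances;		/* counts simulated rebalance_domains() calls */
};

/* True when only a stats update was requested, i.e. no load balance. */
static bool stats_only(unsigned int flags)
{
	return (flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK;
}

/*
 * Walk the idle runqueues: always refresh the blocked load stats, and
 * bail out of the balance part early for a stats-only kick.
 * Returns whether any runqueue still has blocked load.
 */
static bool walk_idle_rqs(struct fake_rq *rqs, int n, unsigned int flags)
{
	bool has_blocked = false;

	for (int i = 0; i < n; i++) {
		struct fake_rq *rq = &rqs[i];

		rq->stats_updates++;
		has_blocked |= rq->has_blocked_load;

		/* Stats-only kick: nothing below applies to this rq. */
		if (stats_only(flags))
			continue;

		if (flags & NOHZ_BALANCE_KICK)
			rq->balances++;
	}
	return has_blocked;
}
```

With flags == NOHZ_STATS_KICK every runqueue gets its stats refreshed and
none is balanced; with NOHZ_KICK_MASK both parts run.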

>> That would also take care of not doing the load balance.
>>>>
>>>>                 /*
>>>>                  * This cpu doesn't have any remaining blocked load, skip it.
>>>>                  * It's sane to do this because this flag is raised in
>>>>                  * nohz_balance_enter_idle()
>>>>                  */
>>>>                 if ((flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK &&
>>>>                     !rq->has_blocked_load)
>>>>                         continue;
> 
> Then, it's worth keeping the call to cpu_load_update_idle(rq), which
> updates the cpu_load[] array, which is still used at some level
> 

Is that something we would want to have in update_nohz_stats() to also
cover the idle_balance -> load_balance update scenario ?
From a quick glance I would've said it shouldn't be needed since the CPU doing
the updates wouldn't have been nohz previously, but we're currently calling
it when going through nohz_newidle_balance() so I might have gotten that wrong.

>>>>
>>>>> +             update_blocked_averages(rq->cpu);
>>>>> +             has_blocked_load |= rq->has_blocked_load;
>>>>> +
