Message-ID: <dbe46747-355d-a0b1-7794-8f511ca54c88@arm.com>
Date:   Fri, 16 Feb 2018 19:23:45 +0000
From:   Valentin Schneider <valentin.schneider@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Morten Rasmussen <morten.rasmussen@...s.arm.com>,
        Brendan Jackman <brendan.jackman@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>
Subject: Re: [PATCH v5 1/3] sched: Stop nohz stats when decayed

On 02/16/2018 05:02 PM, Vincent Guittot wrote:
> On 16 February 2018 at 13:53, Valentin Schneider
> <valentin.schneider@....com> wrote:
>> On 02/14/2018 03:26 PM, Vincent Guittot wrote:
>>> Stop the periodic update of blocked load when all idle CPUs have fully
>>> decayed. We introduce a new nohz.has_blocked that reflects whether some idle
>>> CPUs have blocked load that has to be periodically updated. nohz.has_blocked
>>> is set every time an idle CPU can have blocked load, and it is then cleared
>>> when no more blocked load has been detected during an update. We don't need
>>> atomic operations, only to make sure of the right ordering when updating
>>> nohz.idle_cpus_mask and nohz.has_blocked.
>>>
>>> Suggested-by: Peter Zijlstra (Intel) <peterz@...radead.org>
>>> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
>>> ---
>>>  kernel/sched/fair.c  | 122 ++++++++++++++++++++++++++++++++++++++++++---------
>>>  kernel/sched/sched.h |   1 +
>>>  2 files changed, 102 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 7af1fa9..5a6835e 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>>
>>> [...]
>>>
>>> -static void update_nohz_stats(struct rq *rq)
>>> +static bool update_nohz_stats(struct rq *rq)
>>>  {
>>>  #ifdef CONFIG_NO_HZ_COMMON
>>>       unsigned int cpu = rq->cpu;
>>>
>>> +     if (!rq->has_blocked_load)
>>> +             return false;
>>> +
>>>       if (!cpumask_test_cpu(cpu, nohz.idle_cpus_mask))
>>> -             return;
>>> +             return false;
>>>
>>>       if (!time_after(jiffies, rq->last_blocked_load_update_tick))
>>> -             return;
>>> +             return true;
>>>
>>>       update_blocked_averages(cpu);
>>> +
>>> +     return rq->has_blocked_load;
>>> +#else
>>> +     return false;
>>>  #endif
>>>  }
>>>
>>
>> (Wrongly thought that this bit was in a different patch, comment should have
>> been squashed in previous reply...)
>>
>> I've been thinking about this, and it's a messy one if we want to
>> skip CPUs in idle_balance() / clear the nohz.has_blocked flag.
>>
>> AFAICT, the rq->has_blocked_load flag can be wrongly cleared: if we're
>> calling update_nohz_stats() for CPU A, but CPU A got out/in of
>> idle really quickly in that same timeframe, I'm not sure you can guarantee
>> the clearing of rq->has_blocked_load done in update_blocked_averages() will
>> always end up in memory before the setting of the flag in
>> nohz_balance_enter_idle().
> 
> Not sure it's a problem in this case because the clear done in
> update_blocked_averages() only happens if there is no load on the rq,
> and new load can't be added in the meantime.
> 

You're right, and that's why there's that comment:
>>         /*
>>          * Can be set safely without rq->lock held
>>          * If a clear happens, it will have evaluated last additions because
>>          * rq->lock is held during the check and the clear
>>          */
>>         rq->has_blocked_load = 1;

Even though it's clearly written there, my brain wouldn't process the fact
that the flag is cleared with the rq lock held. So yeah, we can't wrongly
clear rq->has_blocked_load. The only mishap that can happen is that it is
re-raised even though we just went through an update_nohz_stats(), which would
lead to a useless stats update in the future, but that's not as bad.
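
To make the reasoning above concrete, here is a minimal user-space sketch of
the pattern, not the kernel code itself. Everything named toy_* is
hypothetical: a pthread mutex stands in for rq->lock, a halving counter stands
in for the decaying blocked load, and a relaxed C11 atomic stands in for the
plain flag store. It assumes, as discussed in this thread, that load is only
added with the lock held and that the flag is only cleared under the lock
after the load has been checked.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rq; only the fields the sketch needs. */
struct toy_rq {
	pthread_mutex_t lock;		/* plays the role of rq->lock */
	unsigned long blocked_load;	/* toy model of the decaying blocked load */
	atomic_int has_blocked_load;	/* the flag under discussion */
	unsigned int updates;		/* number of update passes actually run */
};

void toy_rq_init(struct toy_rq *rq)
{
	pthread_mutex_init(&rq->lock, NULL);
	rq->blocked_load = 0;
	atomic_init(&rq->has_blocked_load, 0);
	rq->updates = 0;
}

/* Assumption: blocked load can only be added with the lock held. */
void toy_add_blocked_load(struct toy_rq *rq, unsigned long load)
{
	pthread_mutex_lock(&rq->lock);
	rq->blocked_load += load;
	pthread_mutex_unlock(&rq->lock);
}

/* Idle-entry side: raise the flag without taking the lock. */
void toy_enter_idle(struct toy_rq *rq)
{
	atomic_store_explicit(&rq->has_blocked_load, 1, memory_order_relaxed);
}

/*
 * Update side: skip the rq if the flag is down, otherwise decay the load
 * and clear the flag only once the load has reached zero. The check and
 * the clear both happen under the lock, so no load can be added in between.
 * Returns true if the rq still needs periodic updates.
 */
bool toy_update_stats(struct toy_rq *rq)
{
	bool still_blocked;

	if (!atomic_load_explicit(&rq->has_blocked_load, memory_order_relaxed))
		return false;

	pthread_mutex_lock(&rq->lock);
	rq->updates++;
	rq->blocked_load /= 2;	/* toy decay step */
	if (rq->blocked_load == 0)
		atomic_store_explicit(&rq->has_blocked_load, 0,
				      memory_order_relaxed);
	still_blocked = atomic_load_explicit(&rq->has_blocked_load,
					     memory_order_relaxed) != 0;
	pthread_mutex_unlock(&rq->lock);

	return still_blocked;
}
```

In this model, calling toy_enter_idle() right after the load has fully decayed
raises the flag again with nothing left to decay. The next toy_update_stats()
runs one extra pass, finds no load, and clears the flag: a wasted update, but
never a lost one.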
