linux-kernel - Re: [PATCH v5 1/3] sched: Stop nohz stats when decayed

Message-ID: <CAKfTPtAM1z86FBqXOoEu-p2cZonUATWFq1vbZ=6pPHn9h5zM-g@mail.gmail.com>
Date:   Thu, 22 Feb 2018 09:37:34 +0100
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Morten Rasmussen <morten.rasmussen@...s.arm.com>,
        Brendan Jackman <brendan.jackman@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>
Subject: Re: [PATCH v5 1/3] sched: Stop nohz stats when decayed

On 21 February 2018 at 14:13, Valentin Schneider
<valentin.schneider@....com> wrote:
> On 02/16/2018 01:44 PM, Vincent Guittot wrote:
>> On 16 February 2018 at 13:13, Valentin Schneider
>> <valentin.schneider@....com> wrote:
>>> On 02/14/2018 03:26 PM, Vincent Guittot wrote:
>>>> Stop the periodic update of blocked load when all idle CPUs have fully
>>>> decayed. We introduce a new nohz.has_blocked that reflects whether some idle
>>>> CPUs have blocked load that has to be periodically updated. nohz.has_blocked
>>>> is set every time an idle CPU can have blocked load, and it is then cleared
>>>> when no more blocked load has been detected during an update. We don't need
>>>> atomic operations but only to make sure of the right ordering when updating
>>>> nohz.idle_cpus_mask and nohz.has_blocked.
>>>>
>>>> Suggested-by: Peter Zijlstra (Intel) <peterz@...radead.org>
>>>> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
>>>> ---
>>>>  kernel/sched/fair.c  | 122 ++++++++++++++++++++++++++++++++++++++++++---------
>>>>  kernel/sched/sched.h |   1 +
>>>>  2 files changed, 102 insertions(+), 21 deletions(-)
>>>>
>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>> index 7af1fa9..5a6835e 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>>
>>>> [...]
>>>> @@ -9383,11 +9452,16 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>>>>                * work being done for other cpus. Next load
>>>>                * balancing owner will pick it up.
>>>>                */
>>>> -             if (need_resched())
>>>> -                     break;
>>>> +             if (need_resched()) {
>>>> +                     has_blocked_load = true;
>>>> +                     goto abort;
>>>> +             }
>>>>
>>>>               rq = cpu_rq(balance_cpu);
>>>>
>>>
>>> I'd say it's safe to do the following here. The flag is raised in
>>> nohz_balance_enter_idle() before the smp_mb(), so we won't skip a CPU
>>> that just got added to nohz.idle_cpus_mask.
>>
>> rq->has_blocked_load will be set before the barrier only if
>> nohz_tick_stopped is not already set.
>> Otherwise, we skip the cpumask update and the barrier in nohz_balance_enter_idle().
>>
>
> Right, forgot about that bit. I think it's still fine because:
> - nohz_balance_enter_idle() can't be called before the last running task is
> dequeued, which requires the rq lock.
> - update_blocked_averages takes the rq lock and clears rq->has_blocked_load
> with the lock held.
>
> So even though we could have some very unlikely scenario where a CPU quickly
> goes out of and back into idle after nohz.idle_cpus_mask has been read, the
> blocked load itself is safe, so rq->has_blocked_load will end up being set
> correctly. (Took me a while to see it that way.)
>
>
> BTW, with the current set on Peter's sched/testing, update_nohz_stats()
> is called here, which doesn't do the update if !rq->has_blocked_load.
> (Although that check is done without lock/barrier, so maybe we could miss
> a CPU that just went idle?)
>
> I have one more question on that bit:
>
>
>                 has_blocked_load |= update_nohz_stats(rq, true);
>
>                 /*
>                  * If time for next balance is due,
>                  * do the balance.
>                  */
>                 if (time_after_eq(jiffies, rq->next_balance)) {
>                         struct rq_flags rf;
>
>                         rq_lock_irqsave(rq, &rf);
>                         update_rq_clock(rq);
>                         cpu_load_update_idle(rq);
>                         rq_unlock_irqrestore(rq, &rf);
>
>                         if (flags & NOHZ_BALANCE_KICK)
>                                 rebalance_domains(rq, CPU_IDLE);
>                 }
>
>                 if (time_after(next_balance, rq->next_balance)) {
>                         next_balance = rq->next_balance;
>                         update_next_balance = 1;
>                 }
>
>
> Now that I think about it, shouldn't we always have a 'continue' after
> the blocked load update if (flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK ?
> AFAICT we don't want to push the next_balance forward, only the next_blocked.

But we don't push next_balance forward. The loop just takes the earliest
next_balance and updates nohz.next_balance, exactly as is done in a
full idle load balance.
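
As a rough illustration of the point above, here is a userspace sketch of
"keep the earliest next_balance". It is not the kernel code: time_after_ul()
is a stand-in for the kernel's time_after(), and struct earliest and
earliest_note() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's time_after(a, b): true if a is
 * later than b, and still correct when the counter wraps around.
 */
static bool time_after_ul(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical container for the running minimum (not a kernel struct). */
struct earliest {
	unsigned long next_balance;
	bool updated;
};

/*
 * Record rq_next_balance only if it is earlier than what we already
 * hold. The stored deadline can only move backward in time, never
 * forward.
 */
static void earliest_note(struct earliest *e, unsigned long rq_next_balance)
{
	if (time_after_ul(e->next_balance, rq_next_balance)) {
		e->next_balance = rq_next_balance;
		e->updated = true;
	}
}

/* Small usage example with made-up jiffies values. */
static void earliest_self_check(void)
{
	struct earliest e = { .next_balance = 1000, .updated = false };

	earliest_note(&e, 1500);	/* later deadline: ignored */
	assert(e.next_balance == 1000 && !e.updated);

	earliest_note(&e, 400);		/* earlier deadline: taken */
	assert(e.next_balance == 400 && e.updated);
}
```

With this shape, visiting one more rq can at worst leave the stored deadline
unchanged; it never delays it.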

> That would also take care of not doing the load balance.
>>>
>>>                 /*
>>>                  * This cpu doesn't have any remaining blocked load, skip it.
>>>                  * It's sane to do this because this flag is raised in
>>>                  * nohz_balance_enter_idle()
>>>                  */
>>>                 if ((flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK &&
>>>                     !rq->has_blocked_load)
>>>                         continue;

Then, it's worth keeping the call to cpu_load_update_idle(rq), which
updates the cpu_load[] array that is still used at some level.
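
For readers following the flag test in the quoted snippet, a minimal
userspace sketch of that check is below. The bit values assigned to the
NOHZ_* macros are an assumption made for this example (one bit per kick
type, with the mask covering both), and is_stats_only_kick() and
can_skip_cpu() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit layout, for illustration only. */
#define NOHZ_BALANCE_KICK	(1u << 0)
#define NOHZ_STATS_KICK		(1u << 1)
#define NOHZ_KICK_MASK		(NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)

/* True only when the stats bit is set and the balance bit is clear. */
static bool is_stats_only_kick(unsigned int flags)
{
	return (flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK;
}

/*
 * The proposed skip condition: a stats-only kick has nothing to do on
 * a CPU whose blocked load has already fully decayed.
 */
static bool can_skip_cpu(unsigned int flags, bool has_blocked_load)
{
	return is_stats_only_kick(flags) && !has_blocked_load;
}

/* Small usage example. */
static void kick_flags_self_check(void)
{
	assert(can_skip_cpu(NOHZ_STATS_KICK, false));
	/* A full kick still wants the balance pass, so the CPU is not skipped. */
	assert(!can_skip_cpu(NOHZ_KICK_MASK, false));
}
```

Note that comparing against NOHZ_STATS_KICK after masking is stricter than
testing (flags & NOHZ_STATS_KICK): the latter would also match a full kick.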

>>>
>>>> +             update_blocked_averages(rq->cpu);
>>>> +             has_blocked_load |= rq->has_blocked_load;
>>>> +
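
The ordering requirement described in the quoted changelog (set the flag
whenever an idle CPU may carry blocked load, clear it when an update finds
none) might be sketched in userspace C11 roughly as follows. This is an
assumption-laden model, not the patch: the C11 fences stand in for the
kernel barriers, NR_CPUS, blocked_load[], enter_idle(), decay_cpu() and
update_idle_cpus() are hypothetical, and the halving "decay" is only a toy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 8	/* hypothetical size, for the sketch only */

static atomic_uint idle_cpus_mask;	/* models nohz.idle_cpus_mask */
static atomic_bool has_blocked;		/* models nohz.has_blocked */

/*
 * Per-CPU blocked load. In the kernel this state is protected by the
 * rq lock; the lock is omitted here and the sketch is only exercised
 * from a single thread.
 */
static unsigned int blocked_load[NR_CPUS];

/* Idle entry: publish the CPU in the mask first, then raise the flag. */
static void enter_idle(unsigned int cpu, unsigned int load)
{
	blocked_load[cpu] = load;
	atomic_fetch_or_explicit(&idle_cpus_mask, 1u << cpu,
				 memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&has_blocked, true, memory_order_relaxed);
}

/* Toy decay: halve the load and report whether any is left. */
static bool decay_cpu(unsigned int cpu)
{
	blocked_load[cpu] /= 2;
	return blocked_load[cpu] != 0;
}

/*
 * Updater: clear the flag first, then read the mask. A CPU that goes
 * idle concurrently is either seen in the mask or re-raises the flag
 * after our clear, so its blocked load is not forgotten.
 */
static bool update_idle_cpus(void)
{
	bool still_blocked = false;
	unsigned int mask, cpu;

	atomic_store_explicit(&has_blocked, false, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	mask = atomic_load_explicit(&idle_cpus_mask, memory_order_relaxed);

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			still_blocked |= decay_cpu(cpu);

	if (still_blocked)
		atomic_store_explicit(&has_blocked, true,
				      memory_order_relaxed);
	return still_blocked;
}

/* Single-threaded usage example: load 4 decays to 0 in three updates. */
static void nohz_sketch_self_check(void)
{
	enter_idle(2, 4);
	assert(atomic_load(&has_blocked));
	assert(update_idle_cpus());	/* 4 -> 2 */
	assert(update_idle_cpus());	/* 2 -> 1 */
	assert(!update_idle_cpus());	/* 1 -> 0: periodic update can stop */
	assert(!atomic_load(&has_blocked));
}
```

The single-threaded check only exercises the set/clear logic; it does not
demonstrate the ordering guarantee itself, which would need a concurrent
test.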