linux-kernel - Re: [PATCH v5 1/3] sched: Stop nohz stats when decayed

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a351f2de-3a00-1cf8-b16f-59493c5b67f5@arm.com>
Date:   Wed, 21 Feb 2018 13:13:52 +0000
From:   Valentin Schneider <valentin.schneider@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Morten Rasmussen <morten.rasmussen@...s.arm.com>,
        Brendan Jackman <brendan.jackman@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>
Subject: Re: [PATCH v5 1/3] sched: Stop nohz stats when decayed

On 02/16/2018 01:44 PM, Vincent Guittot wrote:
> On 16 February 2018 at 13:13, Valentin Schneider
> <valentin.schneider@....com> wrote:
>> On 02/14/2018 03:26 PM, Vincent Guittot wrote:
>>> Stopped the periodic update of blocked load when all idle CPUs have fully
>>> decayed. We introduce a new nohz.has_blocked that reflect if some idle
>>> CPUs has blocked load that have to be periodiccally updated. nohz.has_blocked
>>> is set everytime that a Idle CPU can have blocked load and it is then clear
>>> when no more blocked load has been detected during an update. We don't need
>>> atomic operation but only to make cure of the right ordering when updating
>>> nohz.idle_cpus_mask and nohz.has_blocked.
>>>
>>> Suggested-by: Peter Zijlstra (Intel) <peterz@...radead.org>
>>> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
>>> ---
>>>  kernel/sched/fair.c  | 122 ++++++++++++++++++++++++++++++++++++++++++---------
>>>  kernel/sched/sched.h |   1 +
>>>  2 files changed, 102 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 7af1fa9..5a6835e 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>>
>>> [...]
>>> @@ -9383,11 +9452,16 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>>>                * work being done for other cpus. Next load
>>>                * balancing owner will pick it up.
>>>                */
>>> -             if (need_resched())
>>> -                     break;
>>> +             if (need_resched()) {
>>> +                     has_blocked_load = true;
>>> +                     goto abort;
>>> +             }
>>>
>>>               rq = cpu_rq(balance_cpu);
>>>
>>
>> I'd say it's safe to do the following here. The flag is raised in
>> nohz_balance_enter_idle() before the smp_mb(), so we won't skip a CPU
>> that just got added to nohz.idle_cpus_mask.
> 
> rq->has_blocked_load will be set before the barrier only if
> nohz_tick_stopped is not already set,
> Otherwise, we skip cpumask update and the barrier in  nohz_balance_enter_idle
> 

Right, forgot about that bit. I think it's still fine because:
- nohz_balance_enter_idle() can't be called before the last running task is
dequeued, which requires the rq lock.
- update_blocked_averages takes the rq lock and clears rq->has_blocked_load
with the lock held.

So even though we could have some very unlikely scenario where a CPU quickly
goes out/in of idle after nohz.idle_cpus_mask has been read, the blocked load
itself is safe so rq->has_blocked_load will end up being set correctly.
(Took me a while to see it that way)


BTW, with the current set on Peter's sched/testing, update_nohz_stats()
is called here, which doesn't do the update if !rq->has_blocked_load
(Although that check is done without lock/barrier, so maybe we could not see
a CPU that just went idle ?)

I have one more question on that bit:


		has_blocked_load |= update_nohz_stats(rq, true);

		/*
		 * If time for next balance is due,
		 * do the balance.
		 */
		if (time_after_eq(jiffies, rq->next_balance)) {
			struct rq_flags rf;

			rq_lock_irqsave(rq, &rf);
			update_rq_clock(rq);
			cpu_load_update_idle(rq);
			rq_unlock_irqrestore(rq, &rf);

			if (flags & NOHZ_BALANCE_KICK)
				rebalance_domains(rq, CPU_IDLE);
		}

		if (time_after(next_balance, rq->next_balance)) {
			next_balance = rq->next_balance;
			update_next_balance = 1;
		}


Now that I think about it, shouldn't we always have a 'continue' after
the blocked load update if (flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK ?
AFAICT we don't want to push the next_balance forward, only the next_blocked.
That would also take care of not doing the load balance.
>>
>>                 /*
>>                  * This cpu doesn't have any remaining blocked load, skip it.
>>                  * It's sane to do this because this flag is raised in
>>                  * nohz_balance_enter_idle()
>>                  */
>>                 if ((flags & NOHZ_KICK_MASK) == NOHZ_STATS_KICK &&
>>                     !rq->has_blocked_load)
>>                         continue;
>>
>>> +             update_blocked_averages(rq->cpu);
>>> +             has_blocked_load |= rq->has_blocked_load;
>>> +