linux-kernel - Re: [PATCH] sched/fair: Avoid divide by zero when rebalancing domains

Message-ID: <adfca8d9-2fc1-744d-ec03-49788c5b3aa2@arm.com>
Date:   Fri, 17 Aug 2018 13:58:53 +0100
From:   Valentin Schneider <valentin.schneider@....com>
To:     Matt Fleming <matt@...eblueprint.co.uk>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
        Mike Galbraith <umgwanakikbuti@...il.com>
Subject: Re: [PATCH] sched/fair: Avoid divide by zero when rebalancing domains

Hi,

On 17/08/18 11:27, Matt Fleming wrote:
> On Thu, 05 Jul, at 05:54:02PM, Valentin Schneider wrote:
>> On 05/07/18 14:27, Matt Fleming wrote:
>>> On Thu, 05 Jul, at 11:10:42AM, Valentin Schneider wrote:
>>>> Hi,
>>>>
>>>> On 04/07/18 15:24, Matt Fleming wrote:
>>>>> It's possible that the CPU doing nohz idle balance hasn't had its own
>>>>> load updated for many seconds. This can lead to huge deltas between
>>>>> rq->avg_stamp and rq->clock when rebalancing, and has been seen to
>>>>> cause the following crash:
>>>>>
>>>>>  divide error: 0000 [#1] SMP
>>>>>  Call Trace:
>>>>>   [<ffffffff810bcba8>] update_sd_lb_stats+0xe8/0x560
>>
>> My confusion comes from not seeing where that crash happens. Would you mind
>> sharing the associated line number? I can feel the "how did I not see this"
>> from there but it can't be helped :(
> 
> The divide by zero comes from scale_rt_capacity() where 'total' is a
> u64 but gets truncated when passed to div_u64() since the divisor
> parameter is u32.
> 

Ah, nasty one. Interestingly enough that bit has been changed quite recently,
so I don't think you can get a div by 0 in there anymore - see
523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()") and subsequent
cleanups.
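
To illustrate the truncation Matt describes above, here is a minimal
userspace sketch. It is not the kernel code: model_div_u64() and
model_div64_u64() are hypothetical stand-ins that only mirror the parameter
widths of div_u64() (u32 divisor) and div64_u64() (u64 divisor), and the
other helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model of div_u64(): the divisor parameter is only 32 bits wide. */
static uint64_t model_div_u64(uint64_t dividend, uint32_t divisor)
{
	assert(divisor != 0);	/* the real division would trap on zero */
	return dividend / divisor;
}

/* Userspace model of div64_u64(): the divisor keeps all 64 bits. */
static uint64_t model_div64_u64(uint64_t dividend, uint64_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}

/* Value the callee receives when a u64 is passed for a u32 parameter. */
static uint32_t truncate_to_u32(uint64_t total)
{
	return (uint32_t)total;	/* keeps only the low 32 bits */
}

/* 1 if 'total' is non-zero as a u64 but becomes 0 once truncated to u32. */
static int truncates_to_zero(uint64_t total)
{
	return total != 0 && truncate_to_u32(total) == 0;
}
```

In this model, a 'total' whose low 32 bits are all zero (for instance exactly
1ULL << 32) reaches the 32-bit divisor as 0, while any other large value is
silently reduced to its low 32 bits and yields a wrong but non-trapping
result. Passing the same value to the 64-bit variant keeps the divisor intact.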

> Sure, you could use div64_u64() instead, but the real issue is that
> the load hasn't been updated for a very long time and that we're
> trying to balance the domains with stale data.
> 

Yeah, I agree with that. However, the problem is specific to cpu_load: blocked
load on nohz CPUs will be periodically updated until entirely decayed. And if
we end up getting rid of cpu_load (depends on how [1] goes), then there's
nothing left to do. But we're not there yet...

[1]: https://lore.kernel.org/lkml/20180809135753.21077-1-dietmar.eggemann@arm.com/