linux-kernel - Re: [PATCH v3 03/14] sched/core: uclamp: add CPU's clamp groups accounting

Message-ID: <434c550d-65da-1b41-b949-c91b9cfdd127@arm.com>
Date:   Thu, 16 Aug 2018 17:00:44 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Quentin Perret <quentin.perret@....com>
Cc:     Patrick Bellasi <patrick.bellasi@....com>,
        linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Tejun Heo <tj@...nel.org>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Paul Turner <pjt@...gle.com>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Todd Kjos <tkjos@...gle.com>,
        Joel Fernandes <joelaf@...gle.com>,
        Steve Muckle <smuckle@...gle.com>,
        Suren Baghdasaryan <surenb@...gle.com>
Subject: Re: [PATCH v3 03/14] sched/core: uclamp: add CPU's clamp groups
 accounting

On 08/16/2018 04:21 PM, Quentin Perret wrote:
> On Thursday 16 Aug 2018 at 15:45:45 (+0200), Dietmar Eggemann wrote:
>> On 08/16/2018 03:37 PM, Quentin Perret wrote:
>>>>> IMHO, if this is something which should not happen at all, a BUG_ON() is the
>>>>> right thing to do here.
>>>>
>>>> I don't agree on that. I agree it should not happen, but since it's a
>>>> recoverable error I think we should not panic.
>>>
>>> FWIW, if this is a recoverable error, I think Linus will agree with
>>> Patrick on this one :-)
>>>
>>> https://lkml.org/lkml/2016/10/4/1
>>
>> Yeah, not really agreeing here that this is a recoverable error.
> 
> A non-recoverable scenario could be, for example, if you corrupt your
> stack and there is absolutely _nothing_ you can do to keep the system up
> and running, because it's just too broken. I don't feel like we're
> talking about such an extreme case here ...

Yeah, that's the extreme. But what about this lovely
BUG_ON(busiest == env.dst_rq) in fair.c's load_balance()?

We could recover by just bailing out ;-)

I guess we know by now that there are different opinions here.

> 
>> Besides, we
>> only consider under-run here, what about over-run?

The important thing is to also detect the over-run, i.e. we add the first
task and the task counter is already > 0.
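
To make the two cases concrete, here is a rough userspace sketch, not the
code from the patch. Everything in it is an assumption made for
illustration: the sketch_rq structure, the sketch_group_get() and
sketch_group_put() helpers, and the SKETCH_WARN() macro, which only stands
in for a kernel-style WARN() by printing and carrying on. It models a
single run queue with a single clamp group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for a kernel-style WARN(): report, then carry on. */
#define SKETCH_WARN(cond, msg) \
	((cond) ? (fprintf(stderr, "WARN: %s\n", (msg)), true) : false)

/* Hypothetical model: one run queue holding one clamp group. */
struct sketch_rq {
	unsigned int nr_running;  /* tasks currently on the run queue */
	unsigned int group_tasks; /* tasks accounted in the clamp group */
};

/* Account one more task. Returns false if an over-run was found and reset. */
static bool sketch_group_get(struct sketch_rq *rq)
{
	bool ok = true;

	assert(rq != NULL);

	/* Over-run: the first task arrives but the counter is already > 0. */
	if (SKETCH_WARN(rq->nr_running == 0 && rq->group_tasks > 0,
			"clamp group counter over-run")) {
		rq->group_tasks = 0; /* recover by resyncing the counter */
		ok = false;
	}

	rq->nr_running++;
	rq->group_tasks++;
	return ok;
}

/* Account one less task. Returns false on under-run; counter stays at 0. */
static bool sketch_group_put(struct sketch_rq *rq)
{
	assert(rq != NULL);

	if (rq->nr_running > 0)
		rq->nr_running--;

	/* Under-run: releasing a task although none is accounted. */
	if (SKETCH_WARN(rq->group_tasks == 0, "clamp group counter under-run"))
		return false; /* recover by bailing out */

	rq->group_tasks--;
	return true;
}
```

In this sketch both checks warn and recover. Replacing SKETCH_WARN() with
something that aborts would give the BUG_ON-style behaviour instead.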

>>
>> Currently this warning doesn't hit, and if the code is changed and it
>> hits, I still find a BUG_ON more appealing here ...
>>
>> So this error scenario can happen over and over again and we always recover
>> from it? The important thing is that we find the culprit for this behaviour
>> as fast as possible ...
> 
> Agreed, we want to debug that ASAP, but WARN should let us do that just
> fine, I think.

+1.