Message-ID: <434c550d-65da-1b41-b949-c91b9cfdd127@arm.com>
Date: Thu, 16 Aug 2018 17:00:44 +0200
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Quentin Perret <quentin.perret@....com>
Cc: Patrick Bellasi <patrick.bellasi@....com>,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Tejun Heo <tj@...nel.org>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Viresh Kumar <viresh.kumar@...aro.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Paul Turner <pjt@...gle.com>,
Morten Rasmussen <morten.rasmussen@....com>,
Juri Lelli <juri.lelli@...hat.com>,
Todd Kjos <tkjos@...gle.com>,
Joel Fernandes <joelaf@...gle.com>,
Steve Muckle <smuckle@...gle.com>,
Suren Baghdasaryan <surenb@...gle.com>
Subject: Re: [PATCH v3 03/14] sched/core: uclamp: add CPU's clamp groups
accounting
On 08/16/2018 04:21 PM, Quentin Perret wrote:
> On Thursday 16 Aug 2018 at 15:45:45 (+0200), Dietmar Eggemann wrote:
>> On 08/16/2018 03:37 PM, Quentin Perret wrote:
>>>>> IMHO, if this is something which should not happen at all, a BUG_ON() is the
>>>>> right thing to do here.
>>>>
>>>> I don't agree on that. I agree it should not happen but since it's a
>>>> recoverable error I think we should not panic.
>>>
>>> FWIW, if this is a recoverable error, I think Linus will agree with
>>> Patrick on this one :-)
>>>
>>> https://lkml.org/lkml/2016/10/4/1
>>
>> Yeah, not really agreeing here that this is a recoverable error.
>
> A non-recoverable scenario could be, for example, if you corrupt your
> stack and there is absolutely _nothing_ you can do to keep the system up
> and running, because it's just too broken. I don't feel like we're
> talking about such an extreme case here ...
Yeah, that's the extreme. But what about this lovely BUG_ON(busiest ==
env.dst_rq) in fair.c's load_balance()?

We could recover by just bailing out ;-)

I guess we know by now that there are different opinions here.
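
For illustration only, the "bail out" alternative mentioned above could be
sketched in plain userspace C as follows. This is not the code in fair.c:
struct lb_rq, lb_warn_on() and lb_load_balance() are hypothetical names, and
lb_warn_on() only imitates the way the kernel's WARN_ON() returns the value
of its condition so that the caller can act on it.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a runqueue; only its identity matters here. */
struct lb_rq {
	int cpu;
};

/*
 * Userspace imitation of WARN_ON(): report the problem and hand the
 * condition back to the caller instead of stopping the program.
 */
static bool lb_warn_on(bool cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/*
 * Toy balance step. Returns the number of tasks "pulled" (always 1 in
 * this sketch), or 0 when the inconsistent case is hit and we bail out.
 */
static int lb_load_balance(struct lb_rq *dst_rq, struct lb_rq *busiest)
{
	/* Warn-and-recover where the original would BUG_ON(). */
	if (lb_warn_on(busiest == dst_rq, "busiest == dst_rq"))
		return 0;

	return 1;	/* pretend one task was moved to dst_rq */
}
```

Calling lb_load_balance() with the same runqueue for both arguments prints
the warning and returns 0, whereas a BUG_ON()-style check would stop right
there. Which of the two is preferable is exactly the disagreement in this
thread.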
>
>> Besides, we
>> only consider under-run here, what about over-run?
The important thing is to also detect the over-run, i.e. we add the first
task and find that the task counter is already > 0.
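
To make the two cases concrete, here is a minimal userspace sketch of
checking both directions. It is not the patch's code: struct toy_rq,
toy_group_get(), toy_group_put() and toy_warn() are hypothetical names, and
treating "first task" as "the runqueue had no running tasks" is an
assumption made for this example.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_GROUPS 4	/* hypothetical number of clamp groups */

struct toy_rq {
	unsigned int nr_running;	/* tasks currently enqueued */
	unsigned int tasks[NR_GROUPS];	/* per-group task counters */
	unsigned int warnings;		/* inconsistencies seen so far */
};

/* Userspace stand-in for a kernel warning: count it and log it. */
static void toy_warn(struct toy_rq *rq, const char *msg)
{
	rq->warnings++;
	fprintf(stderr, "WARN: %s\n", msg);
}

/*
 * Account a task entering a group; the caller guarantees
 * group < NR_GROUPS. Returns false if an over-run was detected
 * (and repaired), true otherwise.
 */
static bool toy_group_get(struct toy_rq *rq, unsigned int group)
{
	bool ok = true;

	/* Over-run: first task on an idle rq, yet the counter is > 0. */
	if (rq->nr_running == 0 && rq->tasks[group] > 0) {
		toy_warn(rq, "clamp group over-run");
		rq->tasks[group] = 0;	/* recover: drop the stale count */
		ok = false;
	}

	rq->tasks[group]++;
	rq->nr_running++;
	return ok;
}

/*
 * Account a task leaving a group. Returns false on under-run, in which
 * case nothing is modified.
 */
static bool toy_group_put(struct toy_rq *rq, unsigned int group)
{
	/* Under-run: releasing a group that has no accounted tasks. */
	if (rq->tasks[group] == 0) {
		toy_warn(rq, "clamp group under-run");
		return false;
	}

	rq->tasks[group]--;
	if (rq->nr_running > 0)
		rq->nr_running--;
	return true;
}
```

Both checks warn and repair the counter rather than stopping. Replacing
toy_warn() with something that aborts would give the BUG_ON-like behaviour
instead.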
>>
>> Currently this warning doesn't hit and if the code will be changed and it
>> hits, I still find a BUG_ON more appealing here ...
>>
>> So this error scenario can happen over and over again and we always recover
>> from it? The important thing is that we find the culprit for this behaviour as
>> fast as possible ...
>
> Agreed, we want to debug that ASAP, but WARN should let us do that just
> fine, I think.
+1.