Message-ID: <jhjr1ubccg2.mognet@arm.com>
Date: Fri, 19 Jun 2020 16:17:17 +0100
From: Valentin Schneider <valentin.schneider@....com>
To: Qais Yousef <qais.yousef@....com>
Cc: Ingo Molnar <mingo@...hat.com>,
    Peter Zijlstra <peterz@...radead.org>,
    Juri Lelli <juri.lelli@...hat.com>,
    Vincent Guittot <vincent.guittot@...aro.org>,
    Dietmar Eggemann <dietmar.eggemann@....com>,
    Steven Rostedt <rostedt@...dmis.org>,
    Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
    Patrick Bellasi <patrick.bellasi@...bug.net>,
    Chris Redpath <chrid.redpath@....com>,
    Lukasz Luba <lukasz.luba@....com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] sched/uclamp: Protect uclamp fast path code with static key

On 19/06/20 15:13, Qais Yousef wrote:
> On 06/19/20 14:25, Valentin Schneider wrote:
>>
>> On 19/06/20 13:51, Qais Yousef wrote:
>> > On 06/19/20 11:36, Valentin Schneider wrote:
>> >>
>> >> On 18/06/20 20:55, Qais Yousef wrote:
>> >> > There is a report that when uclamp is enabled, a netperf UDP test
>> >> > regresses compared to a kernel compiled without uclamp.
>> >> >
>> >> > https://lore.kernel.org/lkml/20200529100806.GA3070@suse.de/
>> >> >
>> >>
>> >> ISTR the perennial form for those is: https://lkml.kernel.org/r/<message-id>
>> >
>> > The link is the correct permalink from lore and contains the message-id, as
>> > Peter likes, and he has accepted this form before.
>> >
>>
>> I think the objections I remember were on using lkml.org rather than
>> lkml.kernel.org. Sorry!
>>
>> > If you look closely you'll see that what you suggest is just moving 'lkml' to
>> > replace lore in the dns name and put an /r/. I don't see a need to enforce one
>> > form over the other as the one I used is much easier to get.
>> >
>>
>> My assumption would be that while lore may fade (it hasn't been there for
>> that long, who knows what will come next), lkml.kernel.org ought to be
>> perennial. Keyword here being "assumption".
>>
>> > If Peter really insists I'll be happy to change.
>> >
>> > [...]
>> >
>> >> > + * This could happen if sched_uclamp_unused was disabled while the
>> >> > + * current task was running, hence we could end up with unbalanced call
>> >> > + * to uclamp_rq_dec_id().
>> >> > + */
>> >> > + if (unlikely(!bucket->tasks))
>> >> > + return;
>> >>
>> >> I'm slightly worried about silent returns for cases like these, can we try
>> >> to cook something up to preserve the previous SCHED_WARN_ON()? Say,
>> >> something like the horrendous below - alternatively it might be feasible
>> >> with some clever p->on_rq flag.
>> >
>> > I am really against extra churn and debug code to detect an impossible case
>> > that is not really tricky for reviewers to discern. Outside of enqueue/dequeue
>> > path, it's only used in uclamp_update_active(). It is quite easy to see that
>> > it's impossible, except for the legit case now when we have a static key
>> > changing when a task is running.
>> >
>>
>> Provided it isn't too much of a head scratcher (and admittedly what I am
>> suggesting is borderline here), I believe it is worthwhile to add debug
>> aids in what are assumed to be impossible cases - even more so in this case,
>> seeing as it had been deemed worth checking previously. We've been proved
>> wrong on the "impossible" nature of some things before.
>>
>> We have a few of those checks strewn over the scheduler code, so it's not
>> like we would be starting a new trend.
>
> I am sorry I am still not bought in. I think the parts you're talking about are
> in the lockless part of the scheduler which are really hard to debug as several
> cpus could be traversing these data from different code paths.

Not necessarily just those, see pick_next_task(),
active_load_balance_cpu_stop(), or a good proportion of SCHED_WARN_ON()'s.

> But here this is
> just extra churn.
>
> If an imbalance has happened this means either:
>
> 1. enqueue/dequeue_task() is imbalanced itself
> 2. uclamp_update_active() calls dec without inc.
>
> If 1 happened we have bigger things to be worried about. For 2 the function
> takes task_rq_lock() and does dec/inc in an obvious way.
>
True. I won't argue over the feasibility of the scenarios we are currently
aware of, my point was that if they do happen, it's nice to have debug
aids in the right places as the final breakage can happen much further
downstream.

FWIW I don't like the diff I suggested at all, but if we can come up with a
cleverer scheme I think we should do it, as per the above.

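
To make the trade-off concrete, here is a rough userspace model of the kind
of scheme being debated. It is only a sketch, not the diff from this thread
and not kernel code: every name in it (uclamp_used, accounted, rq_inc(),
rq_dec(), MODEL_WARN_ON()) is hypothetical. The assumption is that one extra
bit of per-task state records whether the inc actually happened, so a dec
with no matching inc (the static key flipped while the task was running) is
skipped quietly, while a real underflow still warns.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only; every name below is hypothetical. */
static bool uclamp_used;         /* stands in for the static key */
static unsigned int warn_count;  /* number of times the check fired */

struct bucket { unsigned int tasks; };
struct task   { bool accounted; };  /* the extra per-task state */

/* Loosely models a warn-and-continue check; evaluates to cond. */
#define MODEL_WARN_ON(cond) \
	((cond) ? (warn_count++, fprintf(stderr, "WARN: %s\n", #cond), true) : false)

static void rq_inc(struct task *p, struct bucket *b)
{
	if (!uclamp_used)
		return;          /* fast path: nothing is accounted */
	b->tasks++;
	p->accounted = true;
}

static void rq_dec(struct task *p, struct bucket *b)
{
	if (!p->accounted)
		return;          /* legit: enqueued before the key was flipped */
	p->accounted = false;
	if (MODEL_WARN_ON(b->tasks == 0))
		return;          /* genuine imbalance: warn, don't underflow */
	b->tasks--;
}
```

The cost is exactly what is objected to below: extra state per task, plus
the #ifdefs needed to keep it out of non-debug builds.
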
> So I don't see any reason to add new info in task_struct and sprinkle #ifdefs
> to protect against something that I can't see why we couldn't reason
> correctly about now.
>
> We don't use pr_debug() in scheduler (I guess no computer would have booted
> with that on), otherwise that would have been a good candidate for one, yes.
> But we can't do that.
>
> Thanks