Message-ID: <20190122103107.vulmhrucl3vtjiop@e110439-lin>
Date: Tue, 22 Jan 2019 10:31:07 +0000
From: Patrick Bellasi <patrick.bellasi@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
linux-api@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
Tejun Heo <tj@...nel.org>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Viresh Kumar <viresh.kumar@...aro.org>,
Paul Turner <pjt@...gle.com>,
Quentin Perret <quentin.perret@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Morten Rasmussen <morten.rasmussen@....com>,
Juri Lelli <juri.lelli@...hat.com>,
Todd Kjos <tkjos@...gle.com>,
Joel Fernandes <joelaf@...gle.com>,
Steve Muckle <smuckle@...gle.com>,
Suren Baghdasaryan <surenb@...gle.com>
Subject: Re: [PATCH v6 04/16] sched/core: uclamp: Add CPU's clamp buckets
refcounting
On 22-Jan 10:45, Peter Zijlstra wrote:
> On Mon, Jan 21, 2019 at 04:33:38PM +0000, Patrick Bellasi wrote:
> > On 21-Jan 17:12, Peter Zijlstra wrote:
> > > On Mon, Jan 21, 2019 at 03:23:11PM +0000, Patrick Bellasi wrote:
>
> > > > and keep all
> > > > the buckets in use at the beginning of a cache line.
> > >
> > > That; is that the rationale for all this? Note that per the defaults
> > > everything is in a single line already.
> >
> > Yes, that's because of the loop in:
> >
> > dequeue_task()
> > uclamp_cpu_dec()
> > uclamp_cpu_dec_id()
> > uclamp_cpu_update()
> >
> > where the buckets sometimes need to be scanned to find a new max.
> >
> > Consider also that, with mapping, we can more easily increase the
> > bucket count to 20, for finer clamping granularity if needed,
> > without worrying too much about the performance impact, especially
> > since we anyway use only a few different clamp values.
> >
> > So, I agree that mapping adds (code) complexity, but it can also save
> > a few cycles in the fast path... do you think it's not worth the added
> > complexity?
>
> Then maybe split this out in a separate patch? Do the trivial linear
> bucket thing first and then do this smarty pants thing on top.
>
> One problem with the scheme is that it doesn't defrag; so if you get a
> peak usage, you can still end up with only two active buckets in
> different lines.
You're right, that was saved for a later optimization. :/
Mainly because, at least for the main use case we have in mind on
Android, we will likely configure all the required clamps once and for
all at boot time.
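
For reference, the kind of bucket scan in uclamp_cpu_update() we are
talking about boils down to the minimal sketch below; the struct
layout, field names and bucket count here are illustrative assumptions,
not the actual patch code:

#define UCLAMP_BUCKETS 5	/* illustrative bucket count */

struct uclamp_bucket {
	unsigned int value;	/* clamp value tracked by this bucket */
	unsigned int tasks;	/* RUNNABLE tasks refcounted here */
};

struct uclamp_cpu {
	unsigned int value;	/* max-aggregated clamp for this CPU */
	struct uclamp_bucket bucket[UCLAMP_BUCKETS];
};

/*
 * When the bucket holding the current max runs out of tasks on
 * dequeue, scan all the buckets to find the new max clamp value.
 */
static void uclamp_cpu_update(struct uclamp_cpu *uc_cpu)
{
	unsigned int max_value = 0;
	int bucket_id;

	for (bucket_id = 0; bucket_id < UCLAMP_BUCKETS; bucket_id++) {
		if (!uc_cpu->bucket[bucket_id].tasks)
			continue;
		if (uc_cpu->bucket[bucket_id].value > max_value)
			max_value = uc_cpu->bucket[bucket_id].value;
	}

	uc_cpu->value = max_value;
}

Keeping the in-use buckets compact at the start of a cache line was
meant to make exactly this loop cheaper.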
> Also; if it is its own patch, you get a much better view of the
> additional complexity and a chance to justify it ;-)
What about ditching the mapping for the time being and seeing whether
we get a real overhead hit in the future?
At that point we can revamp the mapping patch, this time with proper
defrag support.
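
The trivial linear bucketing would then be something along these lines;
the constants and names below are just assumptions for illustration,
not the actual patch code:

#define SCHED_CAPACITY_SCALE	1024
#define UCLAMP_BUCKETS		5
#define UCLAMP_BUCKET_DELTA	(SCHED_CAPACITY_SCALE / UCLAMP_BUCKETS)

/*
 * Direct clamp_value -> bucket_id mapping: no map table to refcount
 * and no search in the hot path, at the cost of a fixed linear
 * bucketization of the [0..SCHED_CAPACITY_SCALE] range.
 */
static unsigned int uclamp_bucket_id(unsigned int clamp_value)
{
	unsigned int bucket_id = clamp_value / UCLAMP_BUCKET_DELTA;

	return bucket_id < UCLAMP_BUCKETS ? bucket_id : UCLAMP_BUCKETS - 1;
}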
> Also; would it make sense to do s/cpu/rq/ on much of this? All this
> uclamp_cpu_*() stuff really is per rq and takes rq arguments, so why
> does it have cpu in the name... no strong feelings, just noticed it and
> thought it was a tad inconsistent.
The idea behind using "cpu" instead of "rq" was that we use these
helpers only at the root rq level and the clamps are aggregated per-CPU.
I remember one of the first versions used "cpu" instead of "rq" as the
parameter name, and you proposed changing it as an optimization, since
we call these from dequeue_task(), where we already have a *rq.
... but, since the uclamp data lives within struct rq, I think you are
right: it makes more sense to rename the functions.
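
Just to be explicit, the rename would look something like this
(signatures only; the exact v7 names are still to be settled):

struct rq;
struct task_struct;

/* v6: "cpu" in the name, although the helpers already take a *rq */
void uclamp_cpu_inc(struct rq *rq, struct task_struct *p);
void uclamp_cpu_dec(struct rq *rq, struct task_struct *p);

/* v7: per-rq naming, matching the data the helpers actually touch */
void uclamp_rq_inc(struct rq *rq, struct task_struct *p);
void uclamp_rq_dec(struct rq *rq, struct task_struct *p);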
Will do it in v7, thanks.
--
#include <best/regards.h>
Patrick Bellasi