Message-ID: <20180912173515.GH1413@e110439-lin>
Date: Wed, 12 Sep 2018 18:35:15 +0100
From: Patrick Bellasi <patrick.bellasi@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
Ingo Molnar <mingo@...hat.com>, Tejun Heo <tj@...nel.org>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Viresh Kumar <viresh.kumar@...aro.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Paul Turner <pjt@...gle.com>,
Quentin Perret <quentin.perret@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Morten Rasmussen <morten.rasmussen@....com>,
Juri Lelli <juri.lelli@...hat.com>,
Todd Kjos <tkjos@...gle.com>,
Joel Fernandes <joelaf@...gle.com>,
Steve Muckle <smuckle@...gle.com>,
Suren Baghdasaryan <surenb@...gle.com>
Subject: Re: [PATCH v4 02/16] sched/core: uclamp: map TASK's clamp values
into CPU's clamp groups
On 12-Sep 18:12, Peter Zijlstra wrote:
> On Wed, Sep 12, 2018 at 04:56:19PM +0100, Patrick Bellasi wrote:
> > On 12-Sep 15:49, Peter Zijlstra wrote:
> > > On Tue, Aug 28, 2018 at 02:53:10PM +0100, Patrick Bellasi wrote:
>
> > > > +/**
> > > > + * uclamp_map: reference counts a utilization "clamp value"
> > > > + * @value: the utilization "clamp value" required
> > > > + * @se_count: the number of scheduling entities requiring the "clamp value"
> > > > + * @se_lock: serialize reference count updates by protecting se_count
> > >
> > > Why do you have a spinlock to serialize a single value? Don't we have
> > > atomics for that?
> >
> > There are some code paths where it's used to protect clamp groups
> > mapping and initialization, e.g.
> >
> > uclamp_group_get()
> > spin_lock()
> > // initialize clamp group (if required) and then...
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is actually a couple of function calls
> > se_count += 1
> > spin_unlock()
> >
> > Almost all these paths are triggered from user-space and protected
> > by a global uclamp_mutex, except for the fork/exit paths.
> >
> > To serialize these paths I'm using the spinlock above, does it make
> > sense ? Can we use the global uclamp_mutex on forks/exit too ?
>
> OK, then your comment is misleading; it serializes both fields.
Yes... that definitely needs an update.
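
For reference, a minimal user-space sketch of the locked pattern discussed
above, where the lock covers both the value initialization and the count.
It is only an illustration: it uses a C11 atomic_flag as a stand-in for the
kernel spinlock, and map_lock()/map_unlock() and the bool return value are
hypothetical, not taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct uclamp_map {
	int value;		/* clamp value tracked by this slot */
	int se_count;		/* scheduling entities referencing it */
	atomic_flag se_lock;	/* serializes updates of BOTH fields above */
};

/* Stand-in for spin_lock(): busy-wait until the flag is acquired. */
static void map_lock(struct uclamp_map *map)
{
	while (atomic_flag_test_and_set_explicit(&map->se_lock,
						 memory_order_acquire))
		;
}

/* Stand-in for spin_unlock(). */
static void map_unlock(struct uclamp_map *map)
{
	atomic_flag_clear_explicit(&map->se_lock, memory_order_release);
}

/*
 * Take a reference on @map for @clamp_value. An unused slot
 * (se_count == 0) is initialized first; a slot already tracking a
 * different value is refused.
 */
static bool uclamp_group_get(struct uclamp_map *map, int clamp_value)
{
	bool ok = true;

	map_lock(map);
	if (map->se_count == 0)
		map->value = clamp_value;	/* initialize clamp group */
	else if (map->value != clamp_value)
		ok = false;
	if (ok)
		map->se_count += 1;
	map_unlock(map);

	return ok;
}
```

Because the initialization and the increment happen in the same critical
section, a concurrent caller never observes a referenced slot with a stale
value.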
> > One additional observation is that, if in the future we want to add a
> > kernel space API, (e.g. driver asking for a new clamp value), maybe we
> > will need to have a serialized non-sleeping uclamp_group_get() API ?
>
> No idea; but if you want to go all fancy you can replace the whole
> uclamp_map thing with something like:
>
> struct uclamp_map {
> union {
> struct {
> unsigned long v : 10;
> unsigned long c : BITS_PER_LONG - 10;
> };
> atomic_long_t s;
> };
> };
That sounds really cool and scary at the same time :)
The v:10 requires that we never set SCHED_CAPACITY_SCALE>1024
or that we use it to track a percentage value (i.e. [0..100]).
One of the last patches introduces percentage values to userspace.
But, I was considering that in kernel space we should always track
full scale utilization values.
The c:(BITS_PER_LONG-10) restricts the number of concurrently active
SEs refcounting the same clamp value. On 32-bit systems that is only
about 4 million among tasks and cgroups... maybe still reasonable...
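
As a quick check of those ranges, here is a small sketch of the arithmetic.
VALUE_BITS and both helper names are made up for the example; the word
width is passed in as a parameter so the 32-bit and 64-bit cases can be
compared on any machine.

```c
#include <assert.h>

#define VALUE_BITS 10	/* width of the "v" field in the proposed layout */

static_assert(VALUE_BITS < 32, "value field must leave room for a count");

/* Largest value representable in the v field. */
static unsigned long long max_clamp_value(void)
{
	return (1ULL << VALUE_BITS) - 1;
}

/* Largest refcount representable in the c field for a given word width. */
static unsigned long long max_refcount(unsigned int bits_per_long)
{
	return (1ULL << (bits_per_long - VALUE_BITS)) - 1;
}
```

With 32-bit longs this leaves 22 bits for the count, i.e. 4194303
references per clamp value. A 10-bit field holds [0..1023].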
> And use uclamp_map::c == 0 as unused (as per normal refcount
> semantics) and atomic_long_cmpxchg() the whole thing using
> uclamp_map::s.
Yes... that could work for the uclamp_map updates, but as I noted
above, I think I have other calls serialized by that lock... will look
more closely into what you suggest, thanks!
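
To make the suggestion concrete, a rough user-space sketch of the packed
layout follows. It is an assumption-laden illustration, not the patch: it
uses C11 <stdatomic.h> in place of atomic_long_t and atomic_long_cmpxchg(),
packs the two fields with shifts instead of a bitfield union, and
uclamp_map_get()/uclamp_map_put() are hypothetical helper names.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define UCLAMP_VALUE_BITS 10UL
#define UCLAMP_VALUE_MASK ((1UL << UCLAMP_VALUE_BITS) - 1)
#define UCLAMP_COUNT_MAX  (ULONG_MAX >> UCLAMP_VALUE_BITS)

/* One word holds both fields: low 10 bits = value, the rest = refcount. */
struct uclamp_map {
	_Atomic unsigned long s;
};

static unsigned long map_value(unsigned long s)
{
	return s & UCLAMP_VALUE_MASK;
}

static unsigned long map_count(unsigned long s)
{
	return s >> UCLAMP_VALUE_BITS;
}

/*
 * Take a reference for @value. count == 0 means "unused", so a free slot
 * can be claimed for any value; a used slot matches only the same value.
 */
static bool uclamp_map_get(struct uclamp_map *map, unsigned long value)
{
	unsigned long old = atomic_load(&map->s);
	unsigned long new;

	if (value > UCLAMP_VALUE_MASK)
		return false;	/* does not fit in the 10-bit field */

	do {
		unsigned long count = map_count(old);

		if (count && map_value(old) != value)
			return false;	/* slot tracks another clamp value */
		if (count == UCLAMP_COUNT_MAX)
			return false;	/* refcount would overflow */
		new = ((count + 1) << UCLAMP_VALUE_BITS) | value;
		/* on failure, old is reloaded with the current word */
	} while (!atomic_compare_exchange_weak(&map->s, &old, new));

	return true;
}

/* Drop a reference; returns false if the slot was already unused. */
static bool uclamp_map_put(struct uclamp_map *map)
{
	unsigned long old = atomic_load(&map->s);
	unsigned long new;

	do {
		if (!map_count(old))
			return false;
		new = old - (1UL << UCLAMP_VALUE_BITS);
	} while (!atomic_compare_exchange_weak(&map->s, &old, new));

	return true;
}
```

Since value and count change in a single compare-and-exchange, this covers
the map update itself without a lock; it does not cover any other work
that currently runs under the same lock.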
[...]
> > > What's the purpose of that cacheline align statement?
> >
> > In uclamp_maps, we still need to scan the array when a clamp value is
> > changed from user-space, i.e. the cases reported above. Thus, that
> > alignment is just to ensure that we minimize the number of cache lines
> > used. Does that make sense ?
> >
> > Maybe that alignment is implicitly generated by the compiler?
>
> It is not, but if it really is a slow path, we shouldn't care about
> alignment.
Ok, will remove it.
> > > Note that without that apparently superfluous lock, it would be 8*12 =
> > > 96 bytes, which is 1.5 lines and would indeed suggest you set
> > > GROUP_COUNT=7 by default to fill 2 lines.
> >
> > Yes, will check more carefully whether we can rely on just the uclamp_mutex
>
> Well, if we don't care about performance (slow path) then keeping the
> lock is fine, just the comment and alignment are misleading.
Ok
[...]
Cheers,
Patrick
--
#include <best/regards.h>
Patrick Bellasi