Message-ID: <20210611132653.o5iljqtmr2hcvtsl@e107158-lin.cambridge.arm.com>
Date: Fri, 11 Jun 2021 14:26:53 +0100
From: Qais Yousef <qais.yousef@....com>
To: Quentin Perret <qperret@...gle.com>
Cc: mingo@...hat.com, peterz@...radead.org, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rickyiu@...gle.com, wvw@...gle.com,
patrick.bellasi@...bug.net, xuewen.yan94@...il.com,
linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v2 3/3] sched: Make uclamp changes depend on CAP_SYS_NICE

Hi Quentin

On 06/11/21 13:08, Quentin Perret wrote:
> Hi Qais,
>
> On Friday 11 Jun 2021 at 13:48:20 (+0100), Qais Yousef wrote:
> > On 06/10/21 15:13, Quentin Perret wrote:
> > > There is currently nothing preventing tasks from changing their per-task
> > > clamp values in any way that they like. The rationale is probably that
> > > system administrators are still able to limit those clamps thanks to the
> > > cgroup interface. However, this causes pain in a system where both
> > > per-task and per-cgroup clamp values are expected to be under the
> > > control of core system components (as is the case for Android).
> > >
> > > To fix this, let's require CAP_SYS_NICE to increase per-task clamp
> > > values. This allows unprivileged tasks to lower their requests, but not
> > > increase them, which is consistent with the existing behaviour for nice
> > > values.
> >
> > Hmmm. I'm not in favour of this.
> >
> > So uclamp is a performance and power management mechanism, it has no impact on
> > fairness AFAICT, so it being a privileged operation doesn't make sense.
> >
> > We had a thought about this in the past and we didn't think there's any harm if
> > a task (app) wants to self manage. Yes a task could ask to run at max
> > performance and waste power, but anyone can generate a busy loop and waste
> > power too.
> >
> > Now that doesn't mean your use case is not valid. I agree if there's a system
> > wide framework that wants to explicitly manage performance and power of tasks
> > via uclamp, then we can end up with 2 layers of controls overriding each
> > other.
>
> Right, that's the main issue. Also, the reality is that most of the time the
> 'right' clamps are platform-dependent, so most userspace apps are simply
> not equipped to decide what their own clamps should be.

I'd argue this is true from both a framework and an app point of view. It
depends on the application and how it would be used.

I can foresee, for example, an HTTP server wanting to use uclamp to guarantee
a QoS target, i.e. X requests per second or a maximum tail latency of Y. The
application can try to tune (calibrate) itself without having to have the
whole system tuned or pumped on steroids.

Or a framework could manage this on behalf of the application. Both can use
uclamp with a feedback loop to calibrate the perf requirement of the tasks to
meet given perf/power criteria.

If you want to do static management, a system framework would make more sense
in this case, true.

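To make the feedback loop idea a bit more concrete, here is a minimal sketch
of how an app (or a framework acting on its behalf) might calibrate its
uclamp.min request against a tail latency target. The function names, the step
size and the latency samples are all hypothetical. The computed value would
still have to be applied with sched_setattr() and SCHED_FLAG_UTIL_CLAMP_MIN,
which is not shown here.

```python
UCLAMP_MAX = 1024  # same scale as SCHED_CAPACITY_SCALE in the kernel


def next_uclamp_min(current, measured_latency_ms, target_latency_ms, step=64):
    """Return the next uclamp.min request for one control period.

    Hypothetical policy: step up when the latency target is missed,
    step down when it is met, and stay within [0, UCLAMP_MAX].
    """
    if measured_latency_ms > target_latency_ms:
        proposed = current + step  # missing the target: ask for more performance
    else:
        proposed = current - step  # meeting the target: back off to save power
    return max(0, min(UCLAMP_MAX, proposed))


def calibrate(samples_ms, target_latency_ms, start=0, step=64):
    """Run the controller over a series of latency samples.

    Returns the uclamp.min value requested after each sample.
    """
    clamp = start
    history = []
    for latency in samples_ms:
        clamp = next_uclamp_min(clamp, latency, target_latency_ms, step)
        history.append(clamp)
    return history
```

A real controller would likely want hysteresis or a proportional step rather
than a fixed one, but the shape of the loop stays the same.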
>
> > Would it make more sense to have a procfs/sysfs flag that is disabled by
> > default that allows sys-admin to enforce a privileged uclamp access?
> >
> > Something like
> >
> > /proc/sys/kernel/sched_uclamp_privileged
>
> Hmm, dunno, I'm not aware of anything else having a behaviour like that,
> so that feels a bit odd.
I think /proc/sys/kernel/perf_event_paranoid falls into this category.
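As a rough sketch of the semantics I have in mind (not kernel code): the
sched_uclamp_privileged knob does not exist today, and combining it with the
"lowering is always allowed" rule from the patch is my assumption. The
function and parameter names below are hypothetical.

```python
def uclamp_change_allowed(old_value, new_value, has_cap_sys_nice, privileged_mode):
    """Decide whether a task may change one of its per-task clamp values.

    privileged_mode models the proposed (hypothetical) sysctl, off by default.
    """
    if not privileged_mode:
        # Knob off: current behaviour, any task may request any value.
        return True
    if has_cap_sys_nice:
        # Privileged tasks are never restricted.
        return True
    # Knob on, no capability: only lowering (or keeping) the request is allowed.
    return new_value <= old_value
```

With the knob off this is today's behaviour; with it on, it behaves like the
patch, similar to how nice values can be lowered in priority but not raised
without CAP_SYS_NICE.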
>
> > I think both usage scenarios are valid and giving sys-admins the power to
> > enforce a behavior makes more sense for me.
>
> Yes, I wouldn't mind something like that in general. I originally wanted
> to suggest introducing a dedicated capability for uclamp, but that felt
> a bit overkill. Now if others think this should be the way to go I'm
> happy to go implement it.
Would be good to hear what others think for sure :)
Cheers
--
Qais Yousef