linux-kernel - Re: [PATCH v2 3/3] sched: Make uclamp changes depend on CAP_SYS


Message-ID: <20210611132653.o5iljqtmr2hcvtsl@e107158-lin.cambridge.arm.com>
Date:   Fri, 11 Jun 2021 14:26:53 +0100
From:   Qais Yousef <qais.yousef@....com>
To:     Quentin Perret <qperret@...gle.com>
Cc:     mingo@...hat.com, peterz@...radead.org, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, rickyiu@...gle.com, wvw@...gle.com,
        patrick.bellasi@...bug.net, xuewen.yan94@...il.com,
        linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v2 3/3] sched: Make uclamp changes depend on CAP_SYS_NICE

Hi Quentin

On 06/11/21 13:08, Quentin Perret wrote:
> Hi Qais,
> 
> On Friday 11 Jun 2021 at 13:48:20 (+0100), Qais Yousef wrote:
> > On 06/10/21 15:13, Quentin Perret wrote:
> > > There is currently nothing preventing tasks from changing their per-task
> > > clamp values in any way that they like. The rationale is probably that
> > > system administrators are still able to limit those clamps thanks to the
> > > cgroup interface. However, this causes pain in a system where both
> > > per-task and per-cgroup clamp values are expected to be under the
> > > control of core system components (as is the case for Android).
> > > 
> > > To fix this, let's require CAP_SYS_NICE to increase per-task clamp
> > > values. This allows unprivileged tasks to lower their requests, but not
> > > increase them, which is consistent with the existing behaviour for nice
> > > values.
> > 
> > Hmmm. I'm not in favour of this.
> > 
> > So uclamp is a performance and power management mechanism, it has no impact on
> > fairness AFAICT, so it being a privileged operation doesn't make sense.
> > 
> > We had a thought about this in the past and we didn't think there's any harm if
> > a task (app) wants to self manage. Yes a task could ask to run at max
> > performance and waste power, but anyone can generate a busy loop and waste
> > power too.
> > 
> > Now that doesn't mean your use case is not valid. I agree if there's a system
> > wide framework that wants to explicitly manage performance and power of tasks
> > via uclamp, then we can end up with 2 layers of controls overriding each
> > other.
> 
> Right, that's the main issue. Also, the reality is that most of the time the
> 'right' clamps are platform-dependent, so most userspace apps are simply
> not equipped to decide what their own clamps should be.

I'd argue this is true from both a framework and an app point of view. It
depends on the application and how it would be used.

I can foresee, for example, an HTTP server wanting to use uclamp to guarantee
a QoS target, i.e. X requests per second or a maximum of Y tail
latency. The application can try to tune (calibrate) itself without having to
have the whole system tuned or pumped on steroids.

Or a framework could manage this on behalf of the application. Both can use
uclamp with a feedback loop to calibrate the perf requirement of the tasks to
meet a given perf/power criteria.
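
To make the feedback loop idea concrete, here is a rough sketch of a single
calibration step. It is only an illustration: the function name, the 80%
back-off threshold and the fixed step size are all made up for the example and
are not taken from any patch. The only thing assumed from the kernel is that
clamp values live in the range 0..1024.

```c
#include <assert.h>

/* uclamp values are expressed on the scheduler capacity scale, 0..1024. */
#define UCLAMP_SCALE 1024

/*
 * One step of a hypothetical self-calibration loop.
 *
 * cur         - the uclamp min currently requested for the task
 * measured_ms - tail latency observed over the last window
 * target_ms   - the QoS target the task is trying to meet
 * step        - how far to move the clamp per iteration (must be > 0)
 *
 * Returns the next uclamp min to request. The 0.8 factor is an arbitrary
 * dead band so the value does not oscillate around the target.
 */
int next_uclamp_min(int cur, double measured_ms, double target_ms, int step)
{
	int next = cur;

	assert(step > 0);

	if (measured_ms > target_ms)
		next = cur + step;	/* missing the target: ask for more */
	else if (measured_ms < 0.8 * target_ms)
		next = cur - step;	/* comfortably within it: back off */

	/* Keep the request inside the valid clamp range. */
	if (next < 0)
		next = 0;
	if (next > UCLAMP_SCALE)
		next = UCLAMP_SCALE;

	return next;
}
```

The returned value is what the app (or the framework acting on its behalf)
would then pass in the sched_util_min field of sched_setattr() with
SCHED_FLAG_UTIL_CLAMP_MIN set.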

If you want to do static management, a system framework would make more sense
in this case, true.

> 
> > Would it make more sense to have a procfs/sysfs flag that is disabled by
> > default that allows sys-admin to enforce a privileged uclamp access?
> > 
> > Something like
> > 
> > 	/proc/sys/kernel/sched_uclamp_privileged
> 
> Hmm, dunno, I'm not aware of anything else having a behaviour like that,
> so that feels a bit odd.

I think /proc/sys/kernel/perf_event_paranoid falls into this category.
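
Roughly, the semantics I have in mind could look like the sketch below. This is
not kernel code and not a proposed patch: the function and parameter names are
hypothetical, the knob is modelled as a plain 0/1 value, and a single clamp
value stands in for both the min and max clamps. With the knob off, things
behave as today; with it on, raising a clamp needs CAP_SYS_NICE while lowering
stays unprivileged, as in the patch description.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical permission check for a per-task uclamp change.
 *
 * privileged_mode  - value of the proposed sched_uclamp_privileged knob
 *                    (0 = disabled, the default; 1 = enforce privilege)
 * has_cap_sys_nice - whether the caller holds CAP_SYS_NICE
 * cur, req         - current and requested clamp value
 */
bool uclamp_change_allowed(int privileged_mode, bool has_cap_sys_nice,
			   unsigned int cur, unsigned int req)
{
	/* The knob is assumed to be a simple on/off switch. */
	assert(privileged_mode == 0 || privileged_mode == 1);

	/* Knob off: any task may change its own clamps. */
	if (privileged_mode == 0)
		return true;

	/* Knob on: privileged callers may do anything. */
	if (has_cap_sys_nice)
		return true;

	/* Knob on, unprivileged: only lowering (or keeping) is allowed. */
	return req <= cur;
}
```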

> 
> > I think both usage scenarios are valid and giving sys-admins the power to
> > enforce a behavior makes more sense for me.
> 
> Yes, I wouldn't mind something like that in general. I originally wanted
> to suggest introducing a dedicated capability for uclamp, but that felt
> a bit overkill. Now if others think this should be the way to go I'm
> happy to go implement it.

Would be good to hear what others think for sure :)


Cheers

--
Qais Yousef