Message-ID: <20200114213403.cur6gydan6kmplqb@e107158-lin.cambridge.arm.com>
Date:   Tue, 14 Jan 2020 21:34:03 +0000
From:   Qais Yousef <qais.yousef@....com>
To:     Patrick Bellasi <patrick.bellasi@...bug.net>
Cc:     Valentin Schneider <valentin.schneider@....com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Iurii Zaikin <yzaikin@...gle.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        qperret@...gle.com, linux-kernel@...r.kernel.org,
        linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min

On 01/09/20 10:21, Patrick Bellasi wrote:
> That's not entirely true. In that patch we introduce cgroup support
> but, if you look at the code before that patch, for CFS tasks there is
> only:
>  - CFS task-specific values (min,max)=(0,1024) by default
>  - CFS system-wide tunables (min,max)=(1024,1024) by default
> and a change on the system-wide tunable allows for example to enforce
> a uclamp_max=200 on all tasks.
> 
> A similar solution can be implemented for RT tasks, where we have:
>  - RT task-specific values (min,max)=(1024,1024) by default
>  - RT system-wide tunables (min,max)=(1024,1024) by default
>  and a change on the system-wide tunable allows for example to enforce
>  a uclamp_min=200 on all tasks.

I feel I'm already getting lost in the complexity of the interaction here. Do
we really need to go down that path?

So we will end up with a system-wide default for all tasks + a CFS system-wide
default + an RT system-wide default?

As I understand it, we have a single system-wide default now.
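
To make the "single system-wide default" concrete, here is a minimal
user-space sketch of how one pair of system-wide values restricts whatever a
task requests. It is a simplified model, not kernel code: the array
sysctl_uclamp[] and the helper uclamp_effective() are hypothetical names used
only for illustration, and the sketch assumes the restriction is a plain cap.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024

enum uclamp_id { UCLAMP_MIN, UCLAMP_MAX, UCLAMP_CNT };

/*
 * Hypothetical stand-in for the system-wide tunables.
 * Both default to 1024, i.e. no restriction at all.
 */
static unsigned int sysctl_uclamp[UCLAMP_CNT] = {
	[UCLAMP_MIN] = SCHED_CAPACITY_SCALE,
	[UCLAMP_MAX] = SCHED_CAPACITY_SCALE,
};

/*
 * Effective clamp value: the task's request, capped by the
 * system-wide value for the same clamp index.
 */
static unsigned int uclamp_effective(unsigned int requested, enum uclamp_id id)
{
	assert(requested <= SCHED_CAPACITY_SCALE);

	return requested > sysctl_uclamp[id] ? sysctl_uclamp[id] : requested;
}
```

With the defaults, a request passes through unchanged. Setting
sysctl_uclamp[UCLAMP_MAX] = 200 caps a task that asks for 1024 at 200, which
is the "enforce a uclamp_max=200 on all tasks" case from the quoted text.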

>  
> > (Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
> > when turning it on, but I think we don't have to if we just want things like
> > uclamp value propagation?)
> 
> No, the current design for CFS tasks works also on !CONFIG_CFS_GROUP_SCHED.
> That's because in this case:
>   - uclamp_tg_restrict() returns just the task requested value
>   - uclamp_eff_get() _always_ restricts the requested value considering
>     the system defaults
>  
> > It's quite more work than the simple thing Qais is introducing (and on both
> > user and kernel side).
> 
> But if in the future we will want to extend CGroups support to RT then
> we will feel the pains because we do the effective computation in two
> different places.

Hmm, what you're suggesting here is that we want to have
cpu.rt.uclamp.{min,max}? I'm not sure I can agree this is a good idea.

It makes more sense to create a special group for all rt tasks rather than
treat rt tasks in a cgroup differently.

> 
> Do note that proper CGroup support requires that the system default
> values define the values for the root group and are consistently
> propagated down the hierarchy. Thus we need to add a dedicated pair of
> cgroup attributes, e.g. cpu.util.rt.{min,max}.
> 
> To recap, we don't need CGROUP support right now but just to add a new
> default tracking similar to what we do for CFS.
> 
> We already proposed such a support in one of the initial versions of
> the uclamp series:
>    Message-ID: <20190115101513.2822-10-patrick.bellasi@....com>
>    https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/

IIUC what you're suggesting is:

	1. Use the sysctl to specify the default_rt_uclamp_min
	2. Enforce this value in uclamp_eff_get() rather than my sync logic
	3. Remove the current hack to always set
	   rt_task->uclamp_min = uclamp_none(UCLAMP_MAX)

If I got that right, I'd be happy to experiment with it. Otherwise I'm afraid
I'm failing to see the crux of the problem you're trying to highlight.
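
One possible reading of the three steps above, as a minimal user-space
sketch. This is an assumption about how the pieces could fit together, not the
actual patch: struct task_model, the user_defined_min flag, and
uclamp_eff_min() are hypothetical, and only default_rt_uclamp_min is a name
taken from the list above.

```c
#include <assert.h>
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE 1024

/* Hypothetical, heavily reduced model of a task. */
struct task_model {
	bool is_rt;               /* RT scheduling class? */
	bool user_defined_min;    /* did user space request a uclamp_min? */
	unsigned int uclamp_min_req;
};

/* Existing system-wide cap on uclamp_min (assumed default: 1024). */
static unsigned int sysctl_uclamp_util_min = SCHED_CAPACITY_SCALE;

/* Step 1: the new sysctl-controlled default for RT tasks. */
static unsigned int default_rt_uclamp_min = SCHED_CAPACITY_SCALE;

/*
 * Step 2: resolve the RT default at lookup time instead of syncing
 * it into every RT task whenever the sysctl changes.
 * Step 3: nothing is written into the task at creation time, so an
 * RT task with no explicit request simply follows the current default.
 */
static unsigned int uclamp_eff_min(const struct task_model *p)
{
	unsigned int req = p->uclamp_min_req;

	if (p->is_rt && !p->user_defined_min)
		req = default_rt_uclamp_min;

	/* The system-wide restriction still applies on top. */
	return req > sysctl_uclamp_util_min ? sysctl_uclamp_util_min : req;
}
```

In this model, lowering default_rt_uclamp_min to 200 immediately changes the
effective value of every RT task that has not made its own request, with no
per-task update needed. RT tasks with an explicit request and CFS tasks are
unaffected.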

Thanks

--
Qais Yousef
