linux-kernel - Re: [PATCH v5 03/10] cpufreq/schedutil: add rt utilization tracking

Message-ID: <20180531084607.GB17937@localhost.localdomain>
Date:   Thu, 31 May 2018 10:46:07 +0200
From:   Juri Lelli <juri.lelli@...hat.com>
To:     Quentin Perret <quentin.perret@....com>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>, peterz@...radead.org,
        mingo@...nel.org, linux-kernel@...r.kernel.org, rjw@...ysocki.net,
        dietmar.eggemann@....com, Morten.Rasmussen@....com,
        viresh.kumar@...aro.org, valentin.schneider@....com
Subject: Re: [PATCH v5 03/10] cpufreq/schedutil: add rt utilization tracking

On 30/05/18 17:46, Quentin Perret wrote:
> Hi Vincent,
> 
> On Friday 25 May 2018 at 15:12:24 (+0200), Vincent Guittot wrote:
> > Add both cfs and rt utilization when selecting an OPP for cfs tasks as rt
> > can preempt and steal cfs's running time.
> > 
> > Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> > ---
> >  kernel/sched/cpufreq_schedutil.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 28592b6..a84b5a5 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -56,6 +56,7 @@ struct sugov_cpu {
> >  	/* The fields below are only needed when sharing a policy: */
> >  	unsigned long		util_cfs;
> >  	unsigned long		util_dl;
> > +	unsigned long		util_rt;
> >  	unsigned long		max;
> >  
> >  	/* The field below is for single-CPU policies only: */
> > @@ -178,14 +179,21 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
> >  	sg_cpu->max = arch_scale_cpu_capacity(NULL, sg_cpu->cpu);
> >  	sg_cpu->util_cfs = cpu_util_cfs(rq);
> >  	sg_cpu->util_dl  = cpu_util_dl(rq);
> > +	sg_cpu->util_rt  = cpu_util_rt(rq);
> >  }
> >  
> >  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
> >  {
> >  	struct rq *rq = cpu_rq(sg_cpu->cpu);
> > +	unsigned long util;
> >  
> > -	if (rq->rt.rt_nr_running)
> > -		return sg_cpu->max;
> > +	if (rq->rt.rt_nr_running) {
> > +		util = sg_cpu->max;
> 
> So I understand why we want to go to max freq when an RT task is running,
> but I think there are use cases where we might want to be more conservative
> and use the util_avg of the RT rq instead. The first use case is
> battery-powered devices where going to max isn't really affordable from
> an energy standpoint. Android, for example, has been using an RT
> utilization signal to select OPPs for quite a while now, because going
> to max blindly is _very_ expensive.
> 
> And the second use case is thermal pressure. On some modern CPUs, going to
> max freq can lead to stringent thermal capping very quickly, to the
> point where your CPUs might not have enough capacity to serve your tasks
> properly. And that can ultimately hurt the very RT tasks you originally
> tried to run fast. In these systems, in the long term, you'd be better off
> not asking for more than what you really need ...

I proposed the same at the last LPC. Peter NAKed it (since RT is all about
meeting deadlines, and when using FIFO/RR we don't really know how fast
the CPU should go to meet them, so going to max is the only safe decision).

> So what about having a sched_feature to select between going to max and
> using the RT util_avg ? Obviously the default should keep the current
> behaviour.

Peter, would SCHED_FEAT make a difference? :)

Or Patrick's utilization capping applied to RT..
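
To make the two options concrete, here is a rough userspace sketch of what
such a switch could look like. It is only a model, not kernel code: struct
model_cpu, rt_use_util_avg and model_aggregate_util() are hypothetical names
invented for illustration, and the plain sum of the three signals, clamped
to max, is a simplifying assumption rather than what the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the per-CPU data; not struct sugov_cpu. */
struct model_cpu {
	unsigned long util_cfs;
	unsigned long util_dl;
	unsigned long util_rt;
	unsigned long max;
	unsigned int rt_nr_running;
};

/*
 * Hypothetical knob standing in for the proposed sched_feature.
 * The default (false) keeps the current go-to-max behaviour.
 */
static bool rt_use_util_avg = false;

static unsigned long model_aggregate_util(const struct model_cpu *c)
{
	unsigned long util;

	assert(c->max > 0);

	/* Current behaviour: an RT task is runnable, so request max. */
	if (c->rt_nr_running && !rt_use_util_avg)
		return c->max;

	/* Assumed alternative: sum the tracked signals, clamped to max. */
	util = c->util_cfs + c->util_dl + c->util_rt;
	return util < c->max ? util : c->max;
}
```

With the knob off, a CPU with a runnable RT task always asks for max; with it
on, the request follows the summed utilization and only saturates at max.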