Message-ID: <CAKfTPtD4og8CDZzVd-=o7agcchQe8Q6GMWgiz5bDfdAepnX9Wg@mail.gmail.com>
Date: Tue, 9 Jul 2024 09:49:28 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Qais Yousef <qyousef@...alina.io>, "Rafael J. Wysocki" <rafael@...nel.org>, 
	Viresh Kumar <viresh.kumar@...aro.org>, Ingo Molnar <mingo@...nel.org>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Daniel Bristot de Oliveira <bristot@...hat.com>, Valentin Schneider <vschneid@...hat.com>, 
	Christian Loehle <christian.loehle@....com>, Hongyan Xia <hongyan.xia2@....com>, 
	John Stultz <jstultz@...gle.com>, linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6] sched: Consolidate cpufreq updates

On Fri, 5 Jul 2024 at 13:50, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
>
> On 05/07/2024 02:22, Qais Yousef wrote:
> > On 07/04/24 12:12, Dietmar Eggemann wrote:
> >> On 28/06/2024 03:52, Qais Yousef wrote:
> >>> On 06/25/24 14:58, Dietmar Eggemann wrote:
> >>>
> >>>>> @@ -4917,6 +4927,84 @@ static inline void __balance_callbacks(struct rq *rq)
> >>>>>
> >>>>>  #endif
> >>>>>
> >>>>> +static __always_inline void
> >>>>> +__update_cpufreq_ctx_switch(struct rq *rq, struct task_struct *prev)
> >>>>> +{
> >>>>> +#ifdef CONFIG_CPU_FREQ
> >>>>> + if (prev && prev->dl.flags & SCHED_FLAG_SUGOV) {
> >>>>> +         /* Sugov just did an update, don't be too aggressive */
> >>>>> +         return;
> >>>>> + }
> >>>>> +
> >>>>> + /*
> >>>>> +  * RT and DL should always send a freq update. But we can do some
> >>>>> +  * simple checks to avoid it when we know it's not necessary.
> >>>>> +  *
> >>>>> +  * iowait_boost will always trigger a freq update too.
> >>>>> +  *
> >>>>> +  * Fair tasks will only trigger an update if the root cfs_rq has
> >>>>> +  * decayed.
> >>>>> +  *
> >>>>> +  * Everything else should do nothing.
> >>>>> +  */
> >>>>> + switch (current->policy) {
> >>>>> + case SCHED_NORMAL:
> >>>>> + case SCHED_BATCH:
> >>>>
> >>>> What about SCHED_IDLE tasks?
> >>>
> >>> I didn't think they matter from a cpufreq perspective. These tasks will just
> >>> run at whatever frequency the idle system happens to be at and have no
> >>> specific perf requirement, since they should only run when the system is
> >>> idle, which is a recipe for starvation anyway?
> >>
> >> Not sure we're talking about the same thing here? idle_sched_class vs.
> >> SCHED_IDLE policy (a FAIR task with a tiny weight of WEIGHT_IDLEPRIO).
> >
> > Yes, I am referring to the SCHED_IDLE policy too. What is your expectation? AFAIK
> > the goal of this policy is to run when there's nothing else that needs running.
>
> IMHO, SCHED_IDLE tasks fight with all the other FAIR tasks over the rq
> resource. I would include SCHED_IDLE in this switch statement next to
> SCHED_NORMAL and SCHED_BATCH.
> What do you do if only SCHED_IDLE FAIR tasks are runnable? They probably
> also want to have their CPU frequency needs adjusted.

I agree that SCHED_IDLE means "do not preempt SCHED_NORMAL and SCHED_BATCH",
not "run at a random frequency".
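
For illustration, here is a hypothetical version of the switch with SCHED_IDLE
added next to the other fair policies. The body is reconstructed from the
comments in the hunk quoted above (fair tasks only update when the root cfs_rq
has decayed, RT/DL always update, everything else does nothing), so it is a
sketch rather than the posted patch:

	switch (current->policy) {
	case SCHED_NORMAL:
	case SCHED_BATCH:
	case SCHED_IDLE:
		/* Fair tasks: only update if the root cfs_rq has decayed. */
		if (!rq->cfs.decayed)
			return;
		rq->cfs.decayed = 0;
		break;
	case SCHED_FIFO:
	case SCHED_RR:
	case SCHED_DEADLINE:
		/* RT and DL should always send a freq update. */
		break;
	default:
		/* Everything else does nothing. */
		return;
	}

	/* Assumed: forward the iowait boost hint to the governor here. */
	cpufreq_update_util(rq, current->in_iowait ? SCHED_CPUFREQ_IOWAIT : 0);
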

>
> [...]
>
> >>>>> @@ -4766,11 +4738,8 @@ static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
> >>>>>            */
> >>>>>           detach_entity_load_avg(cfs_rq, se);
> >>>>>           update_tg_load_avg(cfs_rq);
> >>>>> - } else if (decayed) {
> >>>>> -         cfs_rq_util_change(cfs_rq, 0);
> >>>>> -
> >>>>> -         if (flags & UPDATE_TG)
> >>>>> -                 update_tg_load_avg(cfs_rq);
> >>>>> + } else if (cfs_rq->decayed && (flags & UPDATE_TG)) {
> >>>>> +         update_tg_load_avg(cfs_rq);
> >>>>>   }
> >>>>>  }
> >>>>
> >>>> You set cfs_rq->decayed for each taskgroup level but you only reset it
> >>>> for the root cfs_rq in __update_cpufreq_ctx_switch() and task_tick_fair()?
> >>>
> >>> Yes. We only care about using it for the root level. Tracking the information
> >>> at the cfs_rq level is the most natural way to do it, as this is what
> >>> update_load_avg() is acting on.
> >>
> >> But IMHO this creates an issue with those non-root cfs_rq's within
> >
> > I am not seeing the issue, could you expand on what it is?
>
> I tried to explain it in the 4 lines below. With a local 'decayed',
> update_cfs_rq_load_avg() and propagate_entity_load_avg() set it every
> time update_load_avg() gets called, and this then determines whether
> update_tg_load_avg() is called on this cfs_rq later in update_load_avg().
>
> The new code:
>
>   cfs_rq->decayed |= update_cfs_rq_load_avg() (*)
>   cfs_rq->decayed |= propagate_entity_load_avg()
>
> will not reset 'cfs_rq->decayed' for non-root cfs_rq's.
>
> (*) You changed this in v3 from:
>
>   cfs_rq->decayed  = update_cfs_rq_load_avg()
>
>
> >> update_load_avg() itself. They will stay decayed after cfs_rq->decayed
> >> has been set to 1 once and will never be reset to 0. So with UPDATE_TG,
> >> update_tg_load_avg() will then always be called on those non-root
> >> cfs_rq's.
> >
> > We could add a check to update only the root cfs_rq. But what do we gain? Or,
> > IOW, what is the harm of unconditionally updating cfs_rq->decayed given that we
> > only care about the root cfs_rq? I see more if conditions and branches, which is
> > what I am trying to avoid.
>
> Yes, keep 'decayed' local and add:
>
>     if (cfs_rq == &rq_of(cfs_rq)->cfs)
>         cfs_rq->decayed = decayed;
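
Putting that suggestion together with the hunk quoted earlier, the relevant
parts of update_load_avg() would look roughly like this (a sketch, not the
posted patch; surrounding branches are omitted, and '|=' is kept so a pending
flag is not lost before the next cpufreq update consumes and resets it):

	decayed  = update_cfs_rq_load_avg(now, cfs_rq);
	decayed |= propagate_entity_load_avg(se);

	/*
	 * Only the root cfs_rq's flag is consumed on the cpufreq path
	 * (__update_cpufreq_ctx_switch() / task_tick_fair()), so only
	 * record it there; non-root cfs_rq's keep 'decayed' local.
	 */
	if (cfs_rq == &rq_of(cfs_rq)->cfs)
		cfs_rq->decayed |= decayed;

	...

	} else if (decayed && (flags & UPDATE_TG)) {
		update_tg_load_avg(cfs_rq);
	}

That way update_tg_load_avg() is again gated by the local 'decayed' for every
cfs_rq, while the root-level flag still feeds the deferred cpufreq update.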
