[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240506100509.GL40213@noisy.programming.kicks-ass.net>
Date: Mon, 6 May 2024 12:05:09 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Qais Yousef <qyousef@...alina.io>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>,
Viresh Kumar <viresh.kumar@...aro.org>,
Ingo Molnar <mingo@...nel.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
Christian Loehle <christian.loehle@....com>,
linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sched: Consolidate cpufreq updates
On Mon, May 06, 2024 at 12:31:03AM +0100, Qais Yousef wrote:
> +static inline void update_cpufreq_ctx_switch(struct rq *rq, struct task_struct *prev)
> +{
> +#ifdef CONFIG_CPU_FREQ
> + unsigned int flags = 0;
> +
> +#ifdef CONFIG_SMP
> + if (unlikely(current->sched_class == &stop_sched_class))
> + return;
> +#endif
why do we care about the stop class? It shouldn't, in general, consume a
lot of cycles.
> +
> + if (unlikely(current->sched_class == &idle_sched_class))
> + return;
And why do we care about idle? Specifically this test doesn't capture
force-idle threads. Notably see is_idle_task().
> +
> + if (unlikely(task_has_idle_policy(current)))
> + return;
> +
> + if (likely(fair_policy(current->policy))) {
> +
> + if (unlikely(current->in_iowait)) {
> + flags |= SCHED_CPUFREQ_IOWAIT | SCHED_CPUFREQ_FORCE_UPDATE;
> + goto force_update;
> + }
> +
> +#ifdef CONFIG_SMP
> + /*
> + * Allow cpufreq updates once for every update_load_avg() decay.
> + */
> + if (unlikely(rq->cfs.decayed)) {
> + rq->cfs.decayed = false;
> + goto force_update;
> + }
> +#endif
> + return;
> + }
> +
> + /*
> + * RT and DL should always send a freq update. But we can do some
> + * simple checks to avoid it when we know it's not necessary.
> + */
> + if (rt_task(current) && rt_task(prev)) {
IIRC dl tasks also match rt_task, so your else clause might not work the
way you've intended.
> +#ifdef CONFIG_UCLAMP_TASK
> + unsigned long curr_uclamp_min = uclamp_eff_value(current, UCLAMP_MIN);
> + unsigned long prev_uclamp_min = uclamp_eff_value(prev, UCLAMP_MIN);
> +
> + if (curr_uclamp_min == prev_uclamp_min)
> +#endif
> + return;
> + } else if (dl_task(current) && current->dl.flags & SCHED_FLAG_SUGOV) {
Notably DL tasks also match rt_task(), so I don't think this clause
exactly does as you expect. Also, isn't the flags check sufficient on
it's own?
> + /* Ignore sugov kthreads, they're responding to our requests */
> + return;
> + }
> +
> + flags |= SCHED_CPUFREQ_FORCE_UPDATE;
> +
> +force_update:
> + cpufreq_update_util(rq, flags);
> +#endif
> +}
But over-all the thing seems very messy, mixing sched_class, policy and
prio based selection methods.
Can't this be cleaned up somewhat?
Notably, if you structure it something like so:
if (fair_policy(current)) {
...
return;
}
if (rt_policy(current)) {
if (dl_task(current) && current->dl.flags & SCHED_FLAG_SUGOV)
return;
if (rt_policy(prev) && uclamps_match(current, prev))
return;
...
return;
}
/* everybody else gets nothing */
return;
You get a lot less branches in the common paths, no?
Powered by blists - more mailing lists