Message-ID: <20240507005659.d4rzzaoq3isanndf@airbuntu>
Date: Tue, 7 May 2024 01:56:59 +0100
From: Qais Yousef <qyousef@...alina.io>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Ingo Molnar <mingo@...nel.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Daniel Bristot de Oliveira <bristot@...hat.com>,
	Valentin Schneider <vschneid@...hat.com>,
	Christian Loehle <christian.loehle@....com>,
	linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sched: Consolidate cpufreq updates

On 05/06/24 12:05, Peter Zijlstra wrote:
> On Mon, May 06, 2024 at 12:31:03AM +0100, Qais Yousef wrote:
> 
> > +static inline void update_cpufreq_ctx_switch(struct rq *rq, struct task_struct *prev)
> > +{
> > +#ifdef CONFIG_CPU_FREQ
> > +	unsigned int flags = 0;
> > +
> > +#ifdef CONFIG_SMP
> > +	if (unlikely(current->sched_class == &stop_sched_class))
> > +		return;
> > +#endif
> 
> why do we care about the stop class? It shouldn't, in general, consume a
> lot of cycles.
> 
> > +
> > +	if (unlikely(current->sched_class == &idle_sched_class))
> > +		return;
> 
> And why do we care about idle? Specifically this test doesn't capture
> force-idle threads. Notably see is_idle_task().

It's just that we don't want these tasks to 'pollute' cpufreq updates, since
they shouldn't care about or contribute to what frequency the CPU should be
running at.

Yes, I missed is_idle_task() in the exclusion list - which can be simplified
as you suggest later.
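
Something like this should cover both cases (untested sketch; is_idle_task()
checks PF_IDLE, so it also captures force-idle threads):

	/* idle and force-idle threads shouldn't drive frequency selection */
	if (unlikely(is_idle_task(current)))
		return;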

> 
> > +
> > +	if (unlikely(task_has_idle_policy(current)))
> > +		return;
> > +
> > +	if (likely(fair_policy(current->policy))) {
> > +
> > +		if (unlikely(current->in_iowait)) {
> > +			flags |= SCHED_CPUFREQ_IOWAIT | SCHED_CPUFREQ_FORCE_UPDATE;
> > +			goto force_update;
> > +		}
> > +
> > +#ifdef CONFIG_SMP
> > +		/*
> > +		 * Allow cpufreq updates once for every update_load_avg() decay.
> > +		 */
> > +		if (unlikely(rq->cfs.decayed)) {
> > +			rq->cfs.decayed = false;
> > +			goto force_update;
> > +		}
> > +#endif
> > +		return;
> > +	}
> > +
> > +	/*
> > +	 * RT and DL should always send a freq update. But we can do some
> > +	 * simple checks to avoid it when we know it's not necessary.
> > +	 */
> > +	if (rt_task(current) && rt_task(prev)) {
> 
> IIRC dl tasks also match rt_task, so your else clause might not work the
> way you've intended.
> 
> > +#ifdef CONFIG_UCLAMP_TASK
> > +		unsigned long curr_uclamp_min = uclamp_eff_value(current, UCLAMP_MIN);
> > +		unsigned long prev_uclamp_min = uclamp_eff_value(prev, UCLAMP_MIN);
> > +
> > +		if (curr_uclamp_min == prev_uclamp_min)
> > +#endif
> > +			return;
> > +	} else if (dl_task(current) && current->dl.flags & SCHED_FLAG_SUGOV) {
> 
> Notably DL tasks also match rt_task(), so I don't think this clause

Hmm, yes. dl priority is negative, so rt_task() will capture DL tasks too.
Shouldn't we fix the function? I can send a separate patch.

	static inline int rt_task(struct task_struct *p)
	{
		return rt_prio(p->prio) && !dl_prio(p->prio);
	}
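
For reference, the overlap comes from the priority ranges. Paraphrasing the
current helpers: DL tasks run at a negative prio, which is below MAX_DL_PRIO
(0) and therefore also below MAX_RT_PRIO:

	static inline int dl_prio(int prio)
	{
		return unlikely(prio < MAX_DL_PRIO);	/* MAX_DL_PRIO == 0 */
	}

	static inline int rt_prio(int prio)
	{
		return unlikely(prio < MAX_RT_PRIO);	/* negative dl prio matches too */
	}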

> exactly does as you expect. Also, isn't the flags check sufficient on
> its own?

I considered this, but opted to keep the dl_task() check, conservatively
assuming that access to the dl structure is only valid for DL tasks. That
seemed safer against potential future changes to the access pattern.

Happy to drop it if this is too conservative.
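
IOW, the flags-only form would look like this (sketch, on the assumption that
dl.flags stays valid to read for non-DL tasks):

	/* sugov kthreads are DL tasks with SCHED_FLAG_SUGOV set; skip them */
	if (current->dl.flags & SCHED_FLAG_SUGOV)
		return;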

> 
> > +		/* Ignore sugov kthreads, they're responding to our requests */
> > +		return;
> > +	}
> > +
> > +	flags |= SCHED_CPUFREQ_FORCE_UPDATE;
> > +
> > +force_update:
> > +	cpufreq_update_util(rq, flags);
> > +#endif
> > +}
> 
> But over-all the thing seems very messy, mixing sched_class, policy and
> prio based selection methods.

Yeah, I started with basic conditions and worked my way through what should be
excluded. We don't have fair_task(), so I used fair_policy(), and out of habit
continued to use rt_task()/dl_task() without realizing the caveat you
highlighted.
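
If it helps readability, a fair_task() helper would be trivial to add
(hypothetical, just mirroring rt_task()/dl_task()):

	static inline int fair_task(struct task_struct *p)
	{
		return fair_policy(p->policy);
	}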

> 
> Can't this be cleaned up somewhat?
> 
> 
> Notably, if you structure it something like so:
> 
> 	if (fair_policy(current)) {
> 		...
> 		return;
> 	}
> 
> 	if (rt_policy(current)) {
> 		if (dl_task(current) && current->dl.flags & SCHED_FLAG_SUGOV)
> 			return;
> 		if (rt_policy(prev) && uclamps_match(current, prev))
> 			return;
> 		...
> 		return;
> 	}
> 
> 	/* everybody else gets nothing */
> 	return;
> 
> You get a lot less branches in the common paths, no?

Yes. How about this? Since the stopper task runs with RT policy, we still need
to check for that class specifically.

Thanks!

--->8---

static inline void update_cpufreq_ctx_switch(struct rq *rq, struct task_struct *prev)
{
#ifdef CONFIG_CPU_FREQ
	if (likely(fair_policy(current->policy))) {

		if (unlikely(current->in_iowait)) {
			cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT | SCHED_CPUFREQ_FORCE_UPDATE);
			return;
		}

#ifdef CONFIG_SMP
		/*
		 * Allow cpufreq updates once for every update_load_avg() decay.
		 */
		if (unlikely(rq->cfs.decayed)) {
			rq->cfs.decayed = false;
			cpufreq_update_util(rq, 0);
			return;
		}
#endif
		return;
	}

	/*
	 * RT and DL should always send a freq update. But we can do some
	 * simple checks to avoid it when we know it's not necessary.
	 */
	if (task_is_realtime(current)) {
		if (dl_task(current) && current->dl.flags & SCHED_FLAG_SUGOV) {
			/* Ignore sugov kthreads, they're responding to our requests */
			return;
		}

		if (rt_task(current) && rt_task(prev)) {
#ifdef CONFIG_UCLAMP_TASK
			unsigned long curr_uclamp_min = uclamp_eff_value(current, UCLAMP_MIN);
			unsigned long prev_uclamp_min = uclamp_eff_value(prev, UCLAMP_MIN);

			if (curr_uclamp_min == prev_uclamp_min)
#endif
				return;
		}

#ifdef CONFIG_SMP
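		/* The stopper task runs with RT policy, so it isn't filtered above */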
		if (unlikely(current->sched_class == &stop_sched_class))
			return;
#endif

		cpufreq_update_util(rq, SCHED_CPUFREQ_FORCE_UPDATE);
		return;
	}

	/* Everything else shouldn't trigger a cpufreq update */
	return;
#endif
}
