lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240902204301.v55w7id3d2kw64qy@airbuntu>
Date: Mon, 2 Sep 2024 21:43:01 +0100
From: Qais Yousef <qyousef@...alina.io>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Christian Loehle <christian.loehle@....com>,
	"Rafael J. Wysocki" <rafael@...nel.org>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Hongyan Xia <hongyan.xia2@....com>,
	John Stultz <jstultz@...gle.com>, linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7] sched: Consolidate cpufreq updates

On 09/02/24 15:36, Vincent Guittot wrote:

> > If we have 1 CPU per PD, then relaxing affinity will allow it to run anywhere.
> > I am just this will be safe on all platforms of course.
> >
> > But yeah, I don't think this is a solution anyway but the simplest thing to
> > make it harder to hit.
> >
> > > The problem is that the 1st switch to task A will be preempted by
> > > sugov so the 1st switch is useless. You should call cpufreq_update
> > > before switching to A so that we skip the useless switch to task A and
> > > directly switch to sugov 1st then task A
> >
> > Can we do this safely after we pick task A, but before we do the actual context
> > switch? One of the reasons I put this too late is because there's a late call
> > to balance_calbacks() that can impact the state of the rq and important to take
> > into account based on my previous testing and analysis.
> 
> I don't have all cases in mind and it would need more thinking but
> this should be doable
> 
> >
> > Any reason we need to run the sugov worker as DL instead for example being
> > a softirq?
> 
> sugov can sleep

Hmm. I thought the biggest worry is about this operation requires synchronous
operation to talk to hw/fw to change the frequency which can be slow and we
don't want this to happen from the scheduler fast path with all the critical
locks held.

If we sleep, then the sugov DL task will experience multiple context switches
to perform its work. So it is very slow anyway.

IMO refactoring the code so we allow drivers that don't sleep to run from
softirq context to speed things up and avoid any context switches is the
sensible optimization to do.

Drivers that sleep will experience significant delays and I'm not seeing the
point of optimizing an additional context switch. I don't see we need to get
out of our way to accommodate these slow platforms. But give them the option to
move to something better if they manage to write their code to avoid sleeps.

Would this be a suitable option to move forward?

FWIW I did test this on pixel 6 which !fast_switch and benchmark scores are
either the same or better (less frame drops in UI particularly).

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ