[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240902204301.v55w7id3d2kw64qy@airbuntu>
Date: Mon, 2 Sep 2024 21:43:01 +0100
From: Qais Yousef <qyousef@...alina.io>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Christian Loehle <christian.loehle@....com>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Viresh Kumar <viresh.kumar@...aro.org>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
Hongyan Xia <hongyan.xia2@....com>,
John Stultz <jstultz@...gle.com>, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7] sched: Consolidate cpufreq updates
On 09/02/24 15:36, Vincent Guittot wrote:
> > If we have 1 CPU per PD, then relaxing affinity will allow it to run anywhere.
> > I am just this will be safe on all platforms of course.
> >
> > But yeah, I don't think this is a solution anyway but the simplest thing to
> > make it harder to hit.
> >
> > > The problem is that the 1st switch to task A will be preempted by
> > > sugov so the 1st switch is useless. You should call cpufreq_update
> > > before switching to A so that we skip the useless switch to task A and
> > > directly switch to sugov 1st then task A
> >
> > Can we do this safely after we pick task A, but before we do the actual context
> > switch? One of the reasons I put this too late is because there's a late call
> > to balance_calbacks() that can impact the state of the rq and important to take
> > into account based on my previous testing and analysis.
>
> I don't have all cases in mind and it would need more thinking but
> this should be doable
>
> >
> > Any reason we need to run the sugov worker as DL instead for example being
> > a softirq?
>
> sugov can sleep
Hmm. I thought the biggest worry is about this operation requires synchronous
operation to talk to hw/fw to change the frequency which can be slow and we
don't want this to happen from the scheduler fast path with all the critical
locks held.
If we sleep, then the sugov DL task will experience multiple context switches
to perform its work. So it is very slow anyway.
IMO refactoring the code so we allow drivers that don't sleep to run from
softirq context to speed things up and avoid any context switches is the
sensible optimization to do.
Drivers that sleep will experience significant delays and I'm not seeing the
point of optimizing an additional context switch. I don't see we need to get
out of our way to accommodate these slow platforms. But give them the option to
move to something better if they manage to write their code to avoid sleeps.
Would this be a suitable option to move forward?
FWIW I did test this on pixel 6 which !fast_switch and benchmark scores are
either the same or better (less frame drops in UI particularly).
Powered by blists - more mailing lists