[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKfTPtA_L=8DtguZgTddSxSEaVLyvYWKYrrtwMmcuteP7ef9FA@mail.gmail.com>
Date: Tue, 3 Sep 2024 08:54:14 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Qais Yousef <qyousef@...alina.io>
Cc: Christian Loehle <christian.loehle@....com>, "Rafael J. Wysocki" <rafael@...nel.org>,
Viresh Kumar <viresh.kumar@...aro.org>, Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>, Hongyan Xia <hongyan.xia2@....com>,
John Stultz <jstultz@...gle.com>, linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7] sched: Consolidate cpufreq updates
On Mon, 2 Sept 2024 at 22:43, Qais Yousef <qyousef@...alina.io> wrote:
>
> On 09/02/24 15:36, Vincent Guittot wrote:
>
> > > If we have 1 CPU per PD, then relaxing affinity will allow it to run anywhere.
> > > I am just this will be safe on all platforms of course.
> > >
> > > But yeah, I don't think this is a solution anyway but the simplest thing to
> > > make it harder to hit.
> > >
> > > > The problem is that the 1st switch to task A will be preempted by
> > > > sugov so the 1st switch is useless. You should call cpufreq_update
> > > > before switching to A so that we skip the useless switch to task A and
> > > > directly switch to sugov 1st then task A
> > >
> > > Can we do this safely after we pick task A, but before we do the actual context
> > > switch? One of the reasons I put this too late is because there's a late call
> > > to balance_calbacks() that can impact the state of the rq and important to take
> > > into account based on my previous testing and analysis.
> >
> > I don't have all cases in mind and it would need more thinking but
> > this should be doable
> >
> > >
> > > Any reason we need to run the sugov worker as DL instead for example being
> > > a softirq?
> >
> > sugov can sleep
>
> Hmm. I thought the biggest worry is about this operation requires synchronous
> operation to talk to hw/fw to change the frequency which can be slow and we
> don't want this to happen from the scheduler fast path with all the critical
> locks held.
>
> If we sleep, then the sugov DL task will experience multiple context switches
> to perform its work. So it is very slow anyway.
A good reason to not add more
>
> IMO refactoring the code so we allow drivers that don't sleep to run from
> softirq context to speed things up and avoid any context switches is the
> sensible optimization to do.
AFAICT, only cpufreq fast_switch is known to be atomic, others can
take a lock and as a result sleep so it's not possible.
Please use fast_switch in this case but not softirq which can end up
in a daemon anyway.
>
> Drivers that sleep will experience significant delays and I'm not seeing the
> point of optimizing an additional context switch. I don't see we need to get
No, They don't have spurious wakeups now, your patch is adding one
more. I don't see why they should accept spurious context switch
> out of our way to accommodate these slow platforms. But give them the option to
> move to something better if they manage to write their code to avoid sleeps.
>
> Would this be a suitable option to move forward?
>
> FWIW I did test this on pixel 6 which !fast_switch and benchmark scores are
> either the same or better (less frame drops in UI particularly).
Powered by blists - more mailing lists