Message-ID: <CAJZ5v0j-o=03hWrSkk2nx9uWctKaRSJmRNXY6d=e0b46_+fNzA@mail.gmail.com>
Date: Wed, 11 Dec 2024 18:55:09 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Christian Loehle <christian.loehle@....com>,
"Rafael J. Wysocki" <rjw@...ysocki.net>, Linux PM <linux-pm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, Lukasz Luba <lukasz.luba@....com>,
Peter Zijlstra <peterz@...radead.org>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Dietmar Eggemann <dietmar.eggemann@....com>, Morten Rasmussen <morten.rasmussen@....com>,
Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
Pierre Gondois <pierre.gondois@....com>
Subject: Re: [RFC][PATCH v021 4/9] sched/topology: Adjust cpufreq checks for EAS

On Wed, Dec 11, 2024 at 6:08 PM Vincent Guittot
<vincent.guittot@...aro.org> wrote:
>
> On Wed, 11 Dec 2024 at 17:38, Rafael J. Wysocki <rafael@...nel.org> wrote:
> >
> > On Wed, Dec 11, 2024 at 2:25 PM Vincent Guittot
> > <vincent.guittot@...aro.org> wrote:
> > >
> > > On Wed, 11 Dec 2024 at 12:29, Rafael J. Wysocki <rafael@...nel.org> wrote:
> > > >
> > > > On Wed, Dec 11, 2024 at 11:33 AM Christian Loehle
> > > > <christian.loehle@....com> wrote:
> > > > >
> > > > > On 11/29/24 16:00, Rafael J. Wysocki wrote:
> > > > > > From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> > > > > >
> > > > > > Make it possible to use EAS with cpufreq drivers that implement the
> > > > > > :setpolicy() callback instead of using generic cpufreq governors.
> > > > > >
> > > > > > This is going to be necessary for using EAS with intel_pstate in its
> > > > > > default configuration.
> > > > > >
> > > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> > > > > > ---
> > > > > >
> > > > > > This is the minimum of what's needed, but I'd really prefer to move
> > > > > > the cpufreq vs EAS checks into cpufreq because messing around cpufreq
> > > > > > internals in topology.c feels like a butcher shop kind of exercise.
> > > > >
> > > > > Makes sense, something like cpufreq_eas_capable().
> > > > >
> > > > > >
> > > > > > Besides, as I said before, I remain unconvinced about the usefulness
> > > > > > of these checks at all. Yes, one is supposed to get the best results
> > > > > > from EAS when running schedutil, but what if they just want to try
> > > > > > something else with EAS? What if they can get better results with
> > > > > > that other thing, surprisingly enough?
> > > > >
> > > > > How do you imagine this to work then?
> > > > > I assume we don't make any 'resulting-OPP-guesses' like
> > > > > sugov_effective_cpu_perf() for any of the setpolicy governors.
> > > > > Neither for dbs and I guess userspace.
> > > > > What about standard powersave and performance?
> > > > > Do we just have a cpufreq callback to ask which OPP to use for
> > > > > the energy calculation? Assume lowest/highest?
> > > > > (I don't think there is hardware where lowest/highest makes a
> > > > > difference, so maybe not bothering with the complexity could
> > > > > be an option, too.)
> > > >
> > > > In the "setpolicy" case there is no way to reliably predict the OPP
> > > > that is going to be used, so why bother?
> > > >
> > > > In the other cases, and if the OPPs are actually known, EAS may still
> > > > make assumptions regarding which of them will be used that will match
> > > > the schedutil selection rules, but if the cpufreq governor happens to
> > > > choose a different OPP, this is not the end of the world.
> > >
> > > Should we add a new cpufreq governor op to return a best-guess
> > > estimate of the compute capacity selection? Something like
> > > cpufreq_effective_cpu_perf(cpu, actual, min, max)
> > > EAS needs to estimate what would be the next OPP; schedutil uses
> > > sugov_effective_cpu_perf() and other governors could provide their own
> >
> > Generally, yes. And documented for that matter.
> >
> > But it doesn't really tell you the OPP, but the performance level that
> > is going to be set for the given list of arguments IIUC. An energy
>
> Yes, the governor returns the performance level it will select and
> request from the cpufreq driver, so EAS can directly map it to an OPP
> and a cost
>
> > model is needed to find an OPP for the given perf level. Or generally
> > the cost of it for that matter.
> >
> > > > Yes, you could have been more energy-efficient had you chosen to use
> > > > schedutil, but you chose otherwise and that's what you get.
> > >
> > > Calling sugov_effective_cpu_perf() for a governor other than
> > > schedutil doesn't make sense.
> >
> > It will work for intel_pstate in the "setpolicy" mode to a reasonable
> > approximation AFAICS.
> >
> > > and do we handle the case when
> > > CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not selected
> >
> > I don't think it's necessary to handle it.
>
> I don't think that CI and others will be happy to possibly get an
> undeclared function. Or you make the other governors depend on
> CPU_FREQ_DEFAULT_GOV_SCHEDUTIL

Do you mean CONFIG_CPU_FREQ_GOV_SCHEDUTIL? Because
CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is only about whether or not schedutil
is the default governor.

I think that it is fine to require CONFIG_CPU_FREQ_GOV_SCHEDUTIL for EAS.
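
To make the shape of this concrete, here is a minimal user-space sketch, not kernel code. It assumes a hypothetical per-governor hook with the signature proposed earlier in the thread, falling back to a schedutil-style estimate that only exists when CONFIG_CPU_FREQ_GOV_SCHEDUTIL is defined. The names governor_sketch, sugov_like_effective_perf() and cpufreq_effective_cpu_perf_sketch() are made up for illustration, and the 25% headroom is an assumption about how schedutil pads utilization rather than a copy of its code.

```c
#include <stddef.h>

/* Stand-in for the Kconfig symbol; in the kernel this is set by the build. */
#define CONFIG_CPU_FREQ_GOV_SCHEDUTIL 1

/* Hypothetical hook type, modeled on the signature proposed in the thread. */
typedef unsigned long (*effective_perf_fn)(int cpu, unsigned long actual,
					   unsigned long min,
					   unsigned long max);

#ifdef CONFIG_CPU_FREQ_GOV_SCHEDUTIL
/*
 * Schedutil-style estimate (assumption): pad the utilization by 25%,
 * cap it at max, and never go below min.
 */
static unsigned long sugov_like_effective_perf(int cpu, unsigned long actual,
					       unsigned long min,
					       unsigned long max)
{
	(void)cpu;
	actual += actual >> 2;
	if (actual < max)
		max = actual;
	return min > max ? min : max;
}
#define DEFAULT_EFFECTIVE_PERF sugov_like_effective_perf
#else
/* Without schedutil there is no fallback estimate to call. */
#define DEFAULT_EFFECTIVE_PERF NULL
#endif

/* Hypothetical stand-in for a governor descriptor. */
struct governor_sketch {
	const char *name;
	effective_perf_fn effective_cpu_perf;	/* NULL: no own estimate */
};

/* A "performance"-like governor always asks for the maximum. */
static unsigned long performance_effective_perf(int cpu, unsigned long actual,
						unsigned long min,
						unsigned long max)
{
	(void)cpu;
	(void)actual;
	(void)min;
	return max;
}

/*
 * Store the estimated performance level in *perf and return 1, or return 0
 * when neither the governor nor the fallback can provide an estimate.
 */
static int cpufreq_effective_cpu_perf_sketch(const struct governor_sketch *gov,
					     int cpu, unsigned long actual,
					     unsigned long min,
					     unsigned long max,
					     unsigned long *perf)
{
	effective_perf_fn fn = gov->effective_cpu_perf ?
			       gov->effective_cpu_perf : DEFAULT_EFFECTIVE_PERF;

	if (!fn)
		return 0;

	*perf = fn(cpu, actual, min, max);
	return 1;
}
```

With the fallback gated this way, a build without CONFIG_CPU_FREQ_GOV_SCHEDUTIL never references an undeclared function. A caller would instead see the 0 return and could decline to enable EAS.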