Message-ID: <CAJZ5v0i5Xrk6oTt81aeXDi1F8gnEspJo9e6nGf10nSvBz-Dbkw@mail.gmail.com>
Date: Mon, 27 Jul 2020 15:48:39 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Ionela Voinescu <ionela.voinescu@....com>
Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>,
Viresh Kumar <viresh.kumar@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Catalin Marinas <catalin.marinas@....com>,
Sudeep Holla <sudeep.holla@....com>,
Will Deacon <will@...nel.org>,
Russell King - ARM Linux <linux@...linux.org.uk>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Linux PM <linux-pm@...r.kernel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Valentin Schneider <valentin.schneider@....com>
Subject: Re: [PATCH v2 1/7] cpufreq: move invariance setter calls in cpufreq core
On Wed, Jul 22, 2020 at 11:38 AM Ionela Voinescu
<ionela.voinescu@....com> wrote:
>
> From: Valentin Schneider <valentin.schneider@....com>
>
> To properly scale its per-entity load-tracking signals, the task scheduler
> needs to be given a frequency scale factor, i.e. some image of the current
> frequency the CPU is running at. Currently, this scale can be computed
> either by using counters (APERF/MPERF on x86, AMU on arm64), or by
> piggy-backing on the frequency selection done by cpufreq.
>
> For the latter, drivers have to explicitly set the scale factor
> themselves, despite it being purely boiler-plate code: the required
> information depends entirely on the kind of frequency switch callback
> implemented by the driver, i.e. either of: target_index(), target(),
> fast_switch() and setpolicy().
>
> The fitness of those callbacks with regard to driving the Frequency
> Invariance Engine (FIE) is studied below:
>
> target_index()
> ==============
> Documentation states that the chosen frequency "must be determined by
> freq_table[index].frequency". It isn't clear if it *has* to be that
> frequency, or if it can use that frequency value to do some computation
> that ultimately leads to a different frequency selection. All drivers
> go for the former, while the vexpress-spc-cpufreq has an atypical
> implementation which is handled separately.
>
> Therefore, the hook works on the assumption the core can use
> freq_table[index].frequency.
>
> target()
> ========
> This has been flagged as deprecated since:
>
> commit 9c0ebcf78fde ("cpufreq: Implement light weight ->target_index() routine")
>
> It also doesn't have that many users:
>
> cpufreq-nforce2.c:371:2: .target = nforce2_target,
> cppc_cpufreq.c:416:2: .target = cppc_cpufreq_set_target,
> gx-suspmod.c:439:2: .target = cpufreq_gx_target,
> pcc-cpufreq.c:573:2: .target = pcc_cpufreq_target,
Also intel_pstate in the passive mode.
>
> Should we care about drivers using this hook, we may be able to exploit
> cpufreq_freq_transition_{begin, end}(). This is handled in a separate
> patch.
>
> fast_switch()
> =============
> This callback *has* to return the frequency that was selected.
>
> setpolicy()
> ===========
> This callback does not have any designated way of informing what was the
> end choice. But there are only two drivers using setpolicy(), and neither
> of them currently has FIE support:
>
> drivers/cpufreq/longrun.c:281: .setpolicy = longrun_set_policy,
> drivers/cpufreq/intel_pstate.c:2215: .setpolicy = intel_pstate_set_policy,
>
> The intel_pstate is known to use counter-driven frequency invariance.
>
> Conclusion
> ==========
>
> Given that the significant majority of current FIE enabled drivers use
> callbacks that lend themselves to triggering the setting of the FIE scale
> factor in a generic way, move the invariance setter calls to cpufreq core.
>
> Since the frequency scale factor is now set in cpufreq core, after the
> callbacks that lend themselves to triggering it, remove this functionality
> from the driver side.
>
> Note that, despite marking a successful frequency change, many
> cpufreq drivers will consider the new frequency as the requested
> frequency, although this might not be the one granted by the hardware.
>
> Therefore, the call to arch_set_freq_scale() is a "best effort" one, and
> it is up to the architecture if the new frequency is used in the new
> frequency scale factor setting (determined by the implementation of
> arch_set_freq_scale()) or eventually used by the scheduler (determined
> by the implementation of arch_scale_freq_capacity()). The architecture
> is in a better position to decide if it has better methods to obtain
> more accurate information regarding the current frequency and use that
> information instead (for example, the use of counters).
>
> Signed-off-by: Valentin Schneider <valentin.schneider@....com>
> Signed-off-by: Ionela Voinescu <ionela.voinescu@....com>
> Cc: Rafael J. Wysocki <rjw@...ysocki.net>
> Cc: Viresh Kumar <viresh.kumar@...aro.org>
> ---
> drivers/cpufreq/cpufreq-dt.c | 10 +---------
> drivers/cpufreq/cpufreq.c | 20 +++++++++++++++++---
> drivers/cpufreq/qcom-cpufreq-hw.c | 9 +--------
> drivers/cpufreq/scmi-cpufreq.c | 12 ++----------
> drivers/cpufreq/scpi-cpufreq.c | 6 +-----
> drivers/cpufreq/vexpress-spc-cpufreq.c | 5 -----
> 6 files changed, 22 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> index 944d7b45afe9..9fd4ce774f12 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -40,16 +40,8 @@ static int set_target(struct cpufreq_policy *policy, unsigned int index)
> {
> struct private_data *priv = policy->driver_data;
> unsigned long freq = policy->freq_table[index].frequency;
> - int ret;
> -
> - ret = dev_pm_opp_set_rate(priv->cpu_dev, freq * 1000);
>
> - if (!ret) {
> - arch_set_freq_scale(policy->related_cpus, freq,
> - policy->cpuinfo.max_freq);
> - }
> -
> - return ret;
> + return dev_pm_opp_set_rate(priv->cpu_dev, freq * 1000);
> }
>
> /*
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 036f4cc42ede..bac4101546db 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2058,9 +2058,16 @@ EXPORT_SYMBOL(cpufreq_unregister_notifier);
> unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
> unsigned int target_freq)
> {
> + unsigned int freq;
> +
> target_freq = clamp_val(target_freq, policy->min, policy->max);
> + freq = cpufreq_driver->fast_switch(policy, target_freq);
> +
> + if (freq)
> + arch_set_freq_scale(policy->related_cpus, freq,
> + policy->cpuinfo.max_freq);
Why can't arch_set_freq_scale() handle freq == 0?
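
To make the question concrete, here is a minimal user-space sketch of a setter that tolerates freq == 0 by itself, so callers would not need the check. It is only an illustration: model_set_freq_scale() and the single freq_scale variable are hypothetical stand-ins (the real setter takes a cpumask and updates per-CPU state), and treating 0 as "leave the current scale untouched" is an assumption, not something the patch defines.

```c
#include <stdbool.h>

/* Fixed-point shift used for capacity values (1024 == 100%). */
#define SCHED_CAPACITY_SHIFT 10

/* Hypothetical stand-in for the per-CPU frequency scale factor. */
static unsigned long freq_scale = 1UL << SCHED_CAPACITY_SHIFT;

/*
 * Hypothetical model of a setter that handles freq == 0 internally.
 * A zero cur_freq (e.g. a failed fast switch) or a zero max_freq
 * leaves the previous scale factor in place.
 * Returns true if the scale factor was updated.
 */
static bool model_set_freq_scale(unsigned long cur_freq,
				 unsigned long max_freq)
{
	if (!cur_freq || !max_freq)
		return false;

	/* scale = cur_freq / max_freq, expressed in 1/1024 units. */
	freq_scale = (cur_freq << SCHED_CAPACITY_SHIFT) / max_freq;
	return true;
}
```

With something along these lines, the caller could pass the fast_switch() return value through unconditionally.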