Message-ID: <CAJZ5v0g+yax=pT4m_2MTd9kUwbk5VBp2wkctTYJpFRU3myEjPQ@mail.gmail.com>
Date: Mon, 17 Feb 2025 12:52:44 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Beata Michalska <beata.michalska@....com>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-pm@...r.kernel.org, ionela.voinescu@....com, sudeep.holla@....com,
will@...nel.org, catalin.marinas@....com, rafael@...nel.org,
viresh.kumar@...aro.org, sumitg@...dia.com, yang@...amperecomputing.com,
vanshikonda@...amperecomputing.com, lihuisong@...wei.com,
zhanjie9@...ilicon.com, ptsm@...ux.microsoft.com,
Jonathan Corbet <corbet@....net>, Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
"H . Peter Anvin" <hpa@...or.com>, Phil Auld <pauld@...hat.com>, x86@...nel.org,
linux-doc@...r.kernel.org
Subject: Re: [PATCH v10 2/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry

On Fri, Jan 31, 2025 at 5:25 PM Beata Michalska <beata.michalska@....com> wrote:
>
> Currently the CPUFreq core exposes two sysfs attributes that can be used
> to query the current frequency of a given CPU(s): cpuinfo_cur_freq
> and scaling_cur_freq. Each provides a slightly different view of the
> subject, and each comes with its own drawbacks.
>
> cpuinfo_cur_freq provides higher precision, though at the cost of being
> rather expensive. Moreover, the information retrieved via this attribute
> is short-lived, as the frequency can change at any point in time,
> making it difficult to reason about.
>
> scaling_cur_freq, on the other hand, tends to be less accurate but then
> the actual level of precision (and source of information) varies between
> architectures making it a bit ambiguous.
>
> The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
> distinct interface, exposing an average frequency of a given CPU(s), as
> reported by the hardware, over a time frame spanning no more than a few
> milliseconds. As it requires appropriate hardware support, this
> interface is optional.
>
> Note that under the hood, the new attribute relies on the information
> provided by arch_freq_get_on_cpu, which, up to this point, has been
> feeding data to the scaling_cur_freq attribute and has been the source
> of ambiguity in its interpretation. This has been amended by
> restoring the intended behavior of scaling_cur_freq, with a new
> dedicated config option to maintain the status quo for those who may
> need it.

In case anyone is waiting for my input here

Acked-by: Rafael J. Wysocki <rafael@...nel.org>

for this and the previous patch and please feel free to route them
both through ARM64.

Thanks!

> CC: Jonathan Corbet <corbet@....net>
> CC: Thomas Gleixner <tglx@...utronix.de>
> CC: Ingo Molnar <mingo@...hat.com>
> CC: Borislav Petkov <bp@...en8.de>
> CC: Dave Hansen <dave.hansen@...ux.intel.com>
> CC: H. Peter Anvin <hpa@...or.com>
> CC: Phil Auld <pauld@...hat.com>
> CC: x86@...nel.org
> CC: linux-doc@...r.kernel.org
>
> Signed-off-by: Beata Michalska <beata.michalska@....com>
> Reviewed-by: Prasanna Kumar T S M <ptsm@...ux.microsoft.com>
> Reviewed-by: Sumit Gupta <sumitg@...dia.com>
> ---
> Documentation/admin-guide/pm/cpufreq.rst | 17 +++++++++++++-
> drivers/cpufreq/Kconfig.x86 | 12 ++++++++++
> drivers/cpufreq/cpufreq.c | 30 +++++++++++++++++++++++-
> 3 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
> index a21369eba034..3950583f2b15 100644
> --- a/Documentation/admin-guide/pm/cpufreq.rst
> +++ b/Documentation/admin-guide/pm/cpufreq.rst
> @@ -248,6 +248,20 @@ are the following:
> If that frequency cannot be determined, this attribute should not
> be present.
>
> +``cpuinfo_avg_freq``
> + An average frequency (in KHz) of all CPUs belonging to a given policy,
> + derived from a hardware provided feedback and reported on a time frame
> + spanning at most few milliseconds.
> +
> + This is expected to be based on the frequency the hardware actually runs
> + at and, as such, might require specialised hardware support (such as AMU
> + extension on ARM). If one cannot be determined, this attribute should
> + not be present.
> +
> + Note, that failed attempt to retrieve current frequency for a given
> + CPU(s) will result in an appropriate error, i.e: EAGAIN for CPU that
> + remains idle (raised on ARM).
> +
> ``cpuinfo_max_freq``
> Maximum possible operating frequency the CPUs belonging to this policy
> can run at (in kHz).
> @@ -293,7 +307,8 @@ are the following:
> Some architectures (e.g. ``x86``) may attempt to provide information
> more precisely reflecting the current CPU frequency through this
> attribute, but that still may not be the exact current CPU frequency as
> - seen by the hardware at the moment.
> + seen by the hardware at the moment. This behavior though, is only
> + available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option.
>
> ``scaling_driver``
> The scaling driver currently in use.
> diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
> index 97c2d4f15d76..2c5c228408bf 100644
> --- a/drivers/cpufreq/Kconfig.x86
> +++ b/drivers/cpufreq/Kconfig.x86
> @@ -340,3 +340,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK
> option lets the probing code bypass some of those checks if the
> parameter "relaxed_check=1" is passed to the module.
>
> +config CPUFREQ_ARCH_CUR_FREQ
> + default y
> + bool "Current frequency derived from HW provided feedback"
> + help
> + This determines whether the scaling_cur_freq sysfs attribute returns
> + the last requested frequency or a more precise value based on hardware
> + provided feedback (as architected counters).
> + Given that a more precise frequency can now be provided via the
> + cpuinfo_avg_freq attribute, by enabling this option,
> + scaling_cur_freq maintains the provision of a counter based frequency,
> + for compatibility reasons.
> +
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 96b013ea177c..a2f31fbb1774 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -734,12 +734,20 @@ __weak int arch_freq_get_on_cpu(int cpu)
> return -EOPNOTSUPP;
> }
>
> +static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy *policy)
> +{
> + return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP;
> +}
> +
> static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
> {
> ssize_t ret;
> int freq;
>
> - freq = arch_freq_get_on_cpu(policy->cpu);
> + freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)
> + ? arch_freq_get_on_cpu(policy->cpu)
> + : 0;
> +
> if (freq > 0)
> ret = sysfs_emit(buf, "%u\n", freq);
> else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
> @@ -784,6 +792,19 @@ static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
> return sysfs_emit(buf, "<unknown>\n");
> }
>
> +/*
> + * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware
> + */
> +static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
> + char *buf)
> +{
> + int avg_freq = arch_freq_get_on_cpu(policy->cpu);
> +
> + if (avg_freq > 0)
> + return sysfs_emit(buf, "%u\n", avg_freq);
> + return avg_freq != 0 ? avg_freq : -EINVAL;
> +}
> +
> /*
> * show_scaling_governor - show the current policy for the specified CPU
> */
> @@ -946,6 +967,7 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
> }
>
> cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
> +cpufreq_freq_attr_ro(cpuinfo_avg_freq);
> cpufreq_freq_attr_ro(cpuinfo_min_freq);
> cpufreq_freq_attr_ro(cpuinfo_max_freq);
> cpufreq_freq_attr_ro(cpuinfo_transition_latency);
> @@ -1073,6 +1095,12 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
> return ret;
> }
>
> + if (cpufreq_avg_freq_supported(policy)) {
> + ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr);
> + if (ret)
> + return ret;
> + }
> +
> ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
> if (ret)
> return ret;
> --
> 2.25.1
>
>
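
For readers who want to see how the new attribute behaves from user space, here is a minimal sketch rather than anything taken from the patch itself. model_show_avg_freq() is a hypothetical user-space model of the return-value mapping in show_cpuinfo_avg_freq() quoted above, and read_avg_freq_khz() is a hypothetical helper that reads the attribute. Both names are assumptions made for illustration, and the errno propagation on a failed read is what I would expect from glibc, not something the patch guarantees.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space model of the return-value mapping used by
 * show_cpuinfo_avg_freq() in the quoted patch:
 *   arch_ret > 0  -> frequency in kHz is formatted into buf
 *   arch_ret < 0  -> the error is passed through (e.g. -EAGAIN)
 *   arch_ret == 0 -> reported as -EINVAL
 * snprintf() stands in for the kernel's sysfs_emit().
 */
static long model_show_avg_freq(int arch_ret, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	if (arch_ret > 0)
		return snprintf(buf, len, "%u\n", (unsigned int)arch_ret);
	return arch_ret != 0 ? arch_ret : -EINVAL;
}

/*
 * Hypothetical helper: read an average frequency (kHz) from a sysfs
 * path. Returns the frequency (> 0) on success or a negative errno
 * value. -ENOENT is expected when the optional attribute is absent;
 * -EAGAIN is expected on ARM when the CPU stays idle (assumption:
 * the C library leaves errno from the failed read() intact).
 */
static long read_avg_freq_khz(const char *path)
{
	FILE *f = fopen(path, "r");
	long khz;

	if (!f)
		return -errno;

	errno = 0;
	if (fscanf(f, "%ld", &khz) != 1) {
		/* No number parsed: report the read error, or EIO if none. */
		int err = errno ? errno : EIO;

		fclose(f);
		return -err;
	}
	fclose(f);
	return khz;
}
```

A caller would typically pass something like "/sys/devices/system/cpu/cpufreq/policy0/cpuinfo_avg_freq" (assuming policy0 exists on the system) and treat -ENOENT as "not supported here" and -EAGAIN as "try again later".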