Message-ID: <CAJZ5v0jCT5exCOz1gmHN+gXaamn-W0Yg0g8KN77vB5tUmsGFOw@mail.gmail.com>
Date: Thu, 5 Feb 2026 20:27:29 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Sumit Gupta <sumitg@...dia.com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Mario Limonciello <mario.limonciello@....com>, 
	Russell Haley <yumpusamongus@...il.com>, "zhenglifeng (A)" <zhenglifeng1@...wei.com>, 
	pierre.gondois@....com, viresh.kumar@...aro.org, ionela.voinescu@....com, 
	corbet@....net, rdunlap@...radead.org, ray.huang@....com, 
	gautham.shenoy@....com, perry.yuan@....com, zhanjie9@...ilicon.com, 
	linux-pm@...r.kernel.org, linux-acpi@...r.kernel.org, 
	linux-doc@...r.kernel.org, acpica-devel@...ts.linux.dev, 
	linux-kernel@...r.kernel.org, linux-tegra@...r.kernel.org, treding@...dia.com, 
	jonathanh@...dia.com, vsethi@...dia.com, ksitaraman@...dia.com, 
	sanjayc@...dia.com, nhartman@...dia.com, bbasu@...dia.com
Subject: Re: [PATCH v7 4/7] ACPI: CPPC: add APIs and sysfs interface for min/max_perf

On Thu, Feb 5, 2026 at 8:21 PM Sumit Gupta <sumitg@...dia.com> wrote:
>
> >>>>>>>>>>> Hi Sumit,
> >>>>>>>>>>>
> >>>>>>>>>>> I am thinking that maybe it is better to call these two sysfs
> >>>>>>>>>>> interfaces 'min_freq' and 'max_freq', as users read and write
> >>>>>>>>>>> kHz instead of raw values.
> >>>>>>>>>> Thanks for the suggestion.
> >>>>>>>>>> Kept min_perf/max_perf to match the CPPC register names
> >>>>>>>>>> (MIN_PERF/MAX_PERF), making it clear to users familiar with
> >>>>>>>>>> CPPC what's being controlled.
> >>>>>>>>>> The kHz unit is documented in the ABI.
> >>>>>>>>>>
> >>>>>>>>>> Thank you,
> >>>>>>>>>> Sumit Gupta
> >>>>>>>>> On my x86 machine with kernel 6.18.5, the kernel is exposing raw values:
> >>>>>>>>>
> >>>>>>>>>> grep . /sys/devices/system/cpu/cpu0/acpi_cppc/*
> >>>>>>>>> /sys/devices/system/cpu/cpu0/acpi_cppc/feedback_ctrs:ref:342904018856568 del:437439724183386
> >>>>>>>>> /sys/devices/system/cpu/cpu0/acpi_cppc/guaranteed_perf:63
> >>>>>>>>> /sys/devices/system/cpu/cpu0/acpi_cppc/highest_perf:88
> >>>>>>>>> /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_freq:0
> >>>>>>>>> /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_nonlinear_perf:36
> >>>>>>>>> /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_perf:1
> >>>>>>>>> /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_freq:3900
> >>>>>>>>> /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_perf:62
> >>>>>>>>> /sys/devices/system/cpu/cpu0/acpi_cppc/reference_perf:62
> >>>>>>>>> /sys/devices/system/cpu/cpu0/acpi_cppc/wraparound_time:18446744073709551615
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> It would be surprising for a nearby sysfs interface with very
> >>>>>>>>> similar names to use kHz instead.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>>
> >>>>>>>>> Russell Haley
> >>>>>>>> I can rename to either of the below:
> >>>>>>>> - min/max_freq: might be confused with scaling_min/max_freq.
> >>>>>>>> - min/max_perf_freq: keeps the CPPC register association clear.
> >>>>>>>>
> >>>>>>>> Rafael, any preferences here?
> >>>>>>> On x86 the units in CPPC are not kHz and there is no easy reliable
> >>>>>>> way to convert them to kHz.
> >>>>>>>
> >>>>>>> Everything under /sys/devices/system/cpu/cpu0/acpi_cppc/ needs to be
> >>>>>>> in CPPC units, not kHz (unless, of course, kHz are CPPC units).
> >>>>>
> >>>>> In v1 [1], these controls were added under acpi_cppc sysfs.
> >>>>> After discussion, they were moved under cpufreq, and [2] was merged
> >>>>> first.
> >>>>> The decision to use frequency scale instead of raw perf was made
> >>>>> for consistency with other cpufreq interfaces, as discussed in v3 [3].
> >>>>>
> >>>>> CPPC units in our case are also not in kHz. The kHz conversion uses the
> >>>>> existing cppc_perf_to_khz()/cppc_khz_to_perf() helpers which are
> >>>>> already used in cppc_cpufreq attributes. So the conversion behavior is
> >>>>> consistent with existing cpufreq interfaces.
> >>>>>
> >>>>> [1]
> >>>>> https://lore.kernel.org/lkml/076c199c-a081-4a7f-956c-f395f4d5e156@nvidia.com/
> >>>>>
> >>>>> [2]
> >>>>> https://lore.kernel.org/all/20250507031941.2812701-1-zhenglifeng1@huawei.com/
> >>>>>
> >>>>> [3]
> >>>>> https://lore.kernel.org/lkml/80e16de0-63e4-4ead-9577-4ebba9b1a02d@nvidia.com/
> >>>>>
> >>>>>
> >>>>>> That said, the new attributes will show up elsewhere.
> >>>>>>
> >>>>>> So why do you need to add these things in the first place?
> >>>>> Currently there's no sysfs interface to dynamically control the
> >>>>> MIN_PERF/MAX_PERF bounds when using autonomous mode. This helps
> >>>>> users tune power and performance at runtime.
> >>>> So what about scaling_min_freq and scaling_max_freq?
> >>>>
> >>>> intel_pstate uses them for an analogous purpose.
> >>> FWIW same thing for amd_pstate.
> >>>
> >> intel_pstate and amd_pstate seem to use setpolicy() to update
> >> scaling_min/max_freq and program MIN_PERF/MAX_PERF.
> > That's one possibility.
> >
> > intel_pstate has a "cpufreq-compatible" mode (in which case it is
> > called intel_cpufreq) and still uses HWP (which is the underlying
> > mechanism for CPPC on Intel platforms).
> >
> >> However, as discussed in v5 [1], cppc_cpufreq cannot switch to
> >> a setpolicy based approach because:
> >> - We need per-CPU control of auto_sel: with setpolicy, we can't
> >>     dynamically disable auto_sel for individual CPUs and fall back to
> >>     target(), because no target hook is available.
> >>     intel_pstate and amd_pstate seem to set HW autonomous mode for
> >>     all CPUs, not per-CPU.
> >> - We need to retain the target() callback - the CPPC spec allows
> >>     desired_perf to be used even when autonomous selection is enabled.
> > intel_pstate in the "cpufreq-compatible" mode updates its HWP min and
> > max limits when .target() (or .fast_switch() or .adjust_perf()) is
> > called.
> >
> > I guess that would not be sufficient in cppc_cpufreq for some reason?
> >
> >> [1]
> >> https://lore.kernel.org/lkml/66f58f43-631b-40a0-8d42-4e90cd24b757@arm.com/
>
> We can do the same as intel_cpufreq. CPPC spec allows setting
> MIN_PERF/MAX_PERF even when auto_selection is disabled, so we will
> have to update them always from policy limits in target().
>
> However, this would override BIOS-configured MIN_PERF/MAX_PERF values.
> Since policy->min/max are set from hardware capabilities during init,
> any governor would overwrite BIOS bounds with policy limits (hardware
> capability bounds) on its first frequency request, even when the user
> hasn't explicitly changed scaling_min/max_freq.
>
> Does intel_cpufreq also override BIOS-configured HWP min/max values?

Yes, it does.

> Should we preserve BIOS-configured values until the user explicitly
> changes scaling_min/max_freq?

Why would that be useful?

> Is there any mechanism in cpufreq core to detect explicit user changes to
> scaling_min/max_freq?

Not today, but since scaling_min/max_freq have their own freq QoS
requests, it should be doable if need be.
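
To make the idea concrete, here is a toy userspace model, not kernel code.
It assumes that each limit is the aggregate of several named requests, that
the request backing scaling_min/max_freq is labelled "user", and that an
untouched request sits at a sentinel default. All names and sentinel values
below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical sentinels standing in for "no constraint requested".
MIN_DEFAULT = 0
MAX_DEFAULT = 2**31 - 1


@dataclass
class FreqConstraints:
    """Toy model of per-policy frequency requests (values in kHz)."""
    min_requests: dict = field(default_factory=dict)
    max_requests: dict = field(default_factory=dict)

    def effective_min(self) -> int:
        # The most demanding (largest) minimum wins.
        return max(self.min_requests.values(), default=MIN_DEFAULT)

    def effective_max(self) -> int:
        # The most restrictive (smallest) maximum wins.
        return min(self.max_requests.values(), default=MAX_DEFAULT)

    def user_changed_limits(self) -> bool:
        # "user" is the request assumed to back scaling_min/max_freq.
        return (self.min_requests.get("user", MIN_DEFAULT) != MIN_DEFAULT
                or self.max_requests.get("user", MAX_DEFAULT) != MAX_DEFAULT)


qos = FreqConstraints()
qos.min_requests["user"] = MIN_DEFAULT
qos.max_requests["user"] = MAX_DEFAULT
qos.max_requests["thermal"] = 2_400_000   # some other, non-user constraint
untouched = qos.user_changed_limits()     # user request still at defaults
qos.max_requests["user"] = 1_800_000      # models a write to scaling_max_freq
touched = qos.user_changed_limits()
```

In this model a user who writes a value equal to the sentinel default is
indistinguishable from one who never wrote anything, so a real
implementation would likely need an explicit flag instead.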

In any case, I would very much prefer using the existing
scaling_min/max_freq interface, even if that would require some
additional plumbing, to adding new sysfs attributes pretty much for
the same purpose that would only be used by one driver.
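
For reference, the perf-versus-kHz conversion mentioned in the quoted
discussion can be pictured as a straight line through two capability
points, (lowest_perf, lowest_freq) and (nominal_perf, nominal_freq). The
sketch below is a simplified userspace illustration under that assumption,
not the kernel's cppc_perf_to_khz()/cppc_khz_to_perf() code. The capability
values are made up, and the model requires lowest and nominal to differ.

```python
from dataclasses import dataclass

KHZ_PER_MHZ = 1000


@dataclass(frozen=True)
class PerfCaps:
    """Hypothetical capability values; frequencies are in MHz."""
    lowest_perf: int
    nominal_perf: int
    lowest_freq: int
    nominal_freq: int


def _line(caps: PerfCaps) -> tuple[int, int, int]:
    """Return (mul, div, offset) for khz = offset + perf * mul / div."""
    mul = (caps.nominal_freq - caps.lowest_freq) * KHZ_PER_MHZ
    div = caps.nominal_perf - caps.lowest_perf
    offset = caps.nominal_freq * KHZ_PER_MHZ - caps.nominal_perf * mul // div
    return mul, div, offset


def perf_to_khz(caps: PerfCaps, perf: int) -> int:
    """Map an abstract perf value to kHz along the two-point line."""
    mul, div, offset = _line(caps)
    return max(0, offset + perf * mul // div)


def khz_to_perf(caps: PerfCaps, khz: int) -> int:
    """Inverse mapping, using integer division (rounds down)."""
    mul, div, offset = _line(caps)
    return max(0, (khz - offset) * div // mul)


# Made-up example values, not taken from any real platform.
caps = PerfCaps(lowest_perf=10, nominal_perf=30,
                lowest_freq=1000, nominal_freq=3000)
khz = perf_to_khz(caps, 20)
perf = khz_to_perf(caps, khz)
```

The x86 listing quoted above reports lowest_freq:0, so this two-point
model would have no usable second anchor point on that machine.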
