Message-ID: <f516dd07-7d7b-403e-a55e-6bf21dbea9b4@kernel.org>
Date: Thu, 26 Sep 2024 13:09:10 -0500
From: Mario Limonciello <superm1@...nel.org>
To: Antheas Kapenekakis <lkml@...heas.dev>,
Shyam Sundar S K <Shyam-sundar.S-k@....com>
Cc: "Rafael J . Wysocki" <rafael@...nel.org>,
Hans de Goede <hdegoede@...hat.com>,
Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
"Luke D . Jones" <luke@...nes.dev>, Mark Pearson
<mpearson-lenovo@...ebb.ca>,
"open list:AMD PMF DRIVER" <platform-driver-x86@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>,
"open list:ACPI" <linux-acpi@...r.kernel.org>,
"Derek J . Clark" <derekjohn.clark@...il.com>, me@...egospodneti.ch,
Denis Benato <benato.denis96@...il.com>,
Mario Limonciello <mario.limonciello@....com>
Subject: Re: [RFC 2/2] platform/x86/amd: pmf: Add manual control support
On 9/26/2024 06:00, Antheas Kapenekakis wrote:
> Hi Shyam,
>
>> I appreciate the proposal, but giving users this control seems similar
>> to using tools like Ryzenadj or Ryzen Master, which are primarily for
>> overclocking. At least Ryzen Master has a dedicated mailbox with PMFW.
>
> In the laptop market I agree with you. However, in the handheld
> market, users expect to be able to lower the power envelope of the
> device on demand in a granular fashion. As the battery drop is
> measured in Watts, tying a slider to Watts is a natural solution.
>
> Most of the time, when those controls are used it is to limit the
> thermal envelope of the device, not exceed it. We want to remove the
> use of these tools and allow manufacturers the ability to customise
> the power envelope they offer to users.
>
>> Some existing PMF mailboxes are being deprecated, and SPL has
>> been removed starting with Strix[1] due to the APTS method.
Hmm, what do you think about offering a wrapper for this for
people to manipulate?
>>
>> It's important to use some settings together rather than individually
>> (which the users might not be aware of). For instance, updating SPL
>> requires corresponding updates to STT limits to avoid negative outcomes.
>
The tough part about striking the balance here is how an end user would
know which values to set in tandem. I think a lot of people just assume
they can "just change SPL", that that's it, and still have a good experience.
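
To make the "set in tandem" problem concrete, here is a minimal sketch of
a combined slider. It assumes a hypothetical pair of manufacturer profiles
(a low and a high end), each carrying an SPL value and one STT skin
temperature limit. The names `Limits` and `combined_limits` and all the
numbers are made up for illustration; none of this is the PMF driver's
actual interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Hypothetical bundle of values that must move together."""
    spl_mw: int             # sustained power limit, in milliwatts
    stt_skin_temp_c: float  # STT skin temperature limit, in degrees C


def combined_limits(low: Limits, high: Limits, requested_spl_mw: int) -> Limits:
    """Clamp the request to the manufacturer range, then derive the STT
    limit by linear interpolation between the two profiles."""
    spl = max(low.spl_mw, min(high.spl_mw, requested_spl_mw))
    span = high.spl_mw - low.spl_mw
    # Fraction of the way from the low profile to the high profile.
    t = 0.0 if span == 0 else (spl - low.spl_mw) / span
    stt = low.stt_skin_temp_c + (high.stt_skin_temp_c - low.stt_skin_temp_c) * t
    return Limits(spl_mw=spl, stt_skin_temp_c=stt)


# Illustrative values only, not taken from any real device.
LOW = Limits(spl_mw=8000, stt_skin_temp_c=40.0)
HIGH = Limits(spl_mw=28000, stt_skin_temp_c=48.0)
```

With something along these lines the user only ever picks one number, and
an out-of-range request is pulled back to the nearest manufacturer limit
rather than being passed through.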
> This suggestion was referring to a combined slider, much like the
> suggestion below. So STT limits would be modified in tandem,
> respecting manufacturer profiles. See comments below.
>
> If you find the name SPL disagreeable, it could be named {tdp,
> tdp_min, tdp_max}. This is the solution used by Valve on the Steam
> Deck (power1_cap{+min,max}, power2_cap{+min,max}).
It's not so much that it's a disagreeable term, but Shyam is pointing out
that SPL is no longer a valid argument to the platform mailbox.
>
> In addition, boost is seen as detrimental to handheld devices, with
> most users disliking and disabling it. Steam Deck does not use boost.
> It is disabled by Steam (power1_cap == power2_cap). So STT and STAPM
> are not very relevant. In addition, Steam Deck van gogh has a more
> linear response so TDP limits are less required.
>
>> Additionally, altering these parameters can exceed thermal limits and
>> potentially void warranties.
>>
>> Considering CnQF, why not let OEMs opt-in and allow the algorithm to
>> manage power budgets, rather than providing these controls to users
>> from the kernel when userspace tools already exist?
The problem is all of the RE tools rely upon PCI config space access or
/dev/mem access to manipulate undocumented register offsets.
When the system is under kernel lockdown (such as with distro kernel
when UEFI secure boot is turned on) then those interfaces are
intentionally locked down.
That's why I'm hoping we can strike some sort of balance between the
request from some advanced users to tune values in a predictable
fashion and allowing OEMs to configure policies like CNQF or Smart PC
for users that don't tinker.
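
For reference, userspace can tell whether lockdown is in effect by reading
the securityfs file /sys/kernel/security/lockdown, which lists the
available modes with the active one in square brackets. The sketch below
only parses that file; the helper names are made up for illustration, and
it assumes securityfs is mounted at the usual location.

```python
from pathlib import Path

LOCKDOWN_PATH = Path("/sys/kernel/security/lockdown")


def parse_lockdown(text: str) -> str:
    """Return the active mode from text such as
    'none [integrity] confidentiality'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active lockdown mode found in {text!r}")


def lockdown_mode(path: Path = LOCKDOWN_PATH):
    """Return the active lockdown mode, or None if the file cannot be
    read (for example when the lockdown LSM is not built in)."""
    try:
        return parse_lockdown(path.read_text())
    except OSError:
        return None
```

Any mode other than "none" means the /dev/mem and PCI config space routes
those tools depend on are expected to be blocked.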
>>
>> Please note that on systems with Smart PC enabled, if users manually
>> adjust the system thermals, it can lead to the thermal controls
>> becoming unmanageable.
Yeah; that's why in this RFC patch I didn't let CNQF, ITS or Smart PC
initialize. Basically if manual control is enabled then "SPS" and
manual sysfs control are the only things available.
>
> Much like you, we dislike AutoTDP solutions that use e.g., RyzenAdj, as they:
> 1) Do not respect manufacturer limits
> 2) Cause system instability such as stutters when setting values
> 3) Can cause crashes if they access the mailbox at the same time as
> the AMD drm driver.
>
Yes. That's exactly why I feel that if we offer an interface, people
can use it instead of these tools.