[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <63dfa6a1-0424-7985-7803-756c0c5cc4a5@redhat.com>
Date: Tue, 13 Oct 2020 15:04:50 +0200
From: Hans de Goede <hdegoede@...hat.com>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Daniel Lezcano <daniel.lezcano@...aro.org>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Lukasz Luba <lukasz.luba@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux PM <linux-pm@...r.kernel.org>,
"Zhang, Rui" <rui.zhang@...el.com>,
Bastien Nocera <hadess@...ess.net>,
Mark Pearson <mpearson@...ovo.com>
Subject: Re: [PATCH 0/4] powercap/dtpm: Add the DTPM framework
Hi Rafael,
On 10/12/20 6:37 PM, Rafael J. Wysocki wrote:
> On Mon, Oct 12, 2020 at 1:46 PM Hans de Goede <hdegoede@...hat.com> wrote:
<snip>
>>> A side note, related to your proposal, not this patch. IMO it suits
>>> better to have /sys/power/profile.
>>>
>>> cat /sys/power/profile
>>>
>>> power
>>> balanced_power *
>>> balanced
>>> balanced_performance
>>> performance
>>>
>>> The (*) being the active profile.
>>
>> Interesting the same thing was brought up in the discussion surrounding
>> RFC which I posted.
>>
>> The downside against this approach is that it assumes that there
>> only is a single system-wide settings. AFAIK that is not always
>> the case, e.g. (AFAIK):
>>
>> 1. The intel pstate driver has something like this
>> (might this be the rapl setting you mean? )
>>
>> 2. The X1C8 has such a setting for the embedded-controller, controlled
>> through the ACPI interfaces which thinkpad-acpi used
>>
>> 3. The hp-wmi interface allows selecting a profile which in turn
>> (through AML code) sets a bunch of variables which influence how
>> the (dynamic, through mjg59's patches) DPTF code controls various
>> things
>>
>> At least the pstate setting and the vendor specific settings can
>> co-exist. Also the powercap API has a notion of zones, I can see the
>> same thing here, with a desktop e.g. having separate performance-profile
>> selection for the CPU and a discrete GPU.
>>
>> So limiting the API to a single /sys/power/profile setting seems a
>> bit limited and I have the feeling we will regret making this
>> choice in the future.
>>
>> With that said your proposal would work well for the current
>> thinkpad_acpi / hp-wmi cases, so I'm not 100% against it.
>>
>> This would require adding some internal API to the code which
>> owns the /sys/power root-dir to allow registering a profile
>> provider I guess. But that would also immediately bring the
>> question, what if multiple drivers try to register themselves
>> as /sys/power/profile provider ?
>
> It doesn't need to work this way IMV.
>
> It may also work by allowing drivers (or whatever kernel entities are
> interested in that) to subscribe to it, so that they get notified
> whenever a new value is written to it by user space (eg. each driver
> may be able to register a callback to be invoked when that happens).
> The information coming from user space will just be passed to the
> subscribers of that interface and they will do about it what they want
> (eg. it may be translated into a value to be written to a
> performance-vs-power interface provided by the platform or similar).
>
> This really is similar to having a class interface with one file per
> "subscribed" device except that the aggregation is done in the kernel
> and not in user space and the subscribers need not be related to
> specific devices. It still allows to avoid exposing the low-level
> interfaces to user space verbatim and it just passes the "policy"
> choice from user space down to the entities that can take it into
> account.
First of all thank you for your input, with your expertise in this
area your input is very much appreciated, after all we only get
one chance to get the userspace API for this right.
Your proposal to have a single sysfs file for userspace to talk
to and then use an in kernel subscription mechanism for drivers
to get notified of writes to this file is interesting.
But I see 2 issues with it:
1. How will userspace know which profiles are actually available ?
An obvious solution is to pick a set of standard names and let
subscribers map those as close to their own settings as possible,
the most often mentioned set of profile names in this case seems to be:
low_power
balanced_power
balanced
balanced_performance
performance
Which works fine for the thinkpad_acpi case, but not so much for
the hp-wmi case. In the HP case what happens is that a WMI call
is made which sets a bunch of ACPI variables which influence
the DPTF code (this assumes we have some sort of DPTF support
such as mjg59's reverse engineered support) but the profile-names
under Windows are: "Performance", "HP recommended", "Cool" and
"Quiet". If you read the discussion from the
"[RFC] Documentation: Add documentation for new performance_profile sysfs class"
thread you will see this was brought up as an issue there.
The problem here is that both "cool" and "quiet" could be
interpreted as low-power. But it seems that they actually mean
what they say, cool focuses on keeping temps low, which can
also be done by making the fan-profile more aggressive. And quiet
is mostly about keeping fan speeds down, at the cost of possible
higher temperatures. IOW we don't really have a 1 dimensional
axis. My class proposal fixes this by having a notion of both
standardized names (because anything else would suck) combined
with a way for drivers to advertise which standardized names
the support. So in my proposal I simply add quiet and cool
to the list of standard profile names, and then the HP-wmi
driver can list those as supported, while not listing
low_power as a supported profile. This way we export the
hardware interface to userspace as is (as much as possible)
while still offering a standardized interface for userspace
to consume. Granted if userspace now actually want to set
a low_power profile, we have just punted the problem to userspace
but I really do not see a better solution.
2. This only works assuming that all performance-profiles
are system wide. But given a big desktop case there might
be very well be separate cooling zones for e.g. the CPU
and the GPU and I can imagine both having separate
performance-profile settings and some users will doubtlessly
want to be able to control these separately ...
Regards,
Hans
Powered by blists - more mailing lists