lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f0f1b31b-a0fc-4d21-8b79-c896833dae35@nvidia.com>
Date: Fri, 14 Mar 2025 18:18:43 +0530
From: Sumit Gupta <sumitg@...dia.com>
To: "zhenglifeng (A)" <zhenglifeng1@...wei.com>, "Rafael J. Wysocki"
	<rafael@...nel.org>, Viresh Kumar <viresh.kumar@...aro.org>
CC: Viresh Kumar <viresh.kumar@...aro.org>, <lenb@...nel.org>,
	<robert.moore@...el.com>, <corbet@....net>, <linux-pm@...r.kernel.org>,
	<linux-acpi@...r.kernel.org>, <linux-doc@...r.kernel.org>,
	<acpica-devel@...ts.linux.dev>, <linux-kernel@...r.kernel.org>,
	<linux-tegra@...r.kernel.org>, <treding@...dia.com>, <jonathanh@...dia.com>,
	<sashal@...dia.com>, <vsethi@...dia.com>, <ksitaraman@...dia.com>,
	<sanjayc@...dia.com>, <bbasu@...dia.com>, Sumit Gupta <sumitg@...dia.com>
Subject: Re: [Patch 0/5] Support Autonomous Selection mode in cppc_cpufreq



>>>
>>> There seems to be some quite fundamental disagreement on how this
>>> should be done, so I'm afraid I cannot do much about it ATM.
>>>
>>> Please agree on a common approach and come back to me when you are ready.
>>>
>>> Sending two concurrent patchsets under confusingly similar names again
>>> and again isn't particularly helpful.
>>>
>>> Thanks!
>>
>> Hi Rafael,
>>
>> Thank you for looking into this.
>>
>> Hi Lifeng,
>>
>> As per the discussion, we can make the driver future extensible and
>> also can optimize the register read/write access.
>>
>> I gave some thought and below is my proposal.
>>
>> 1) Pick 'Patch 1-7' from your patch series [1] which optimize API's
>>     to read/write a cpc register.
>>
>> 2) Pick my patches in [2]:
>>     - Patch 1-4: Keep all cpc registers together under acpi_cppc sysfs.
>>                  Also, update existing API's to read/write regs in batch.
>>     - Patch 5: Creates 'cppc_cpufreq_epp_driver' instance for booting
>>       all CPU's in Auto mode and set registers with right values.
>>       They can be updated after boot from sysfs to change hints to HW.
>>       I can use the optimized API's from [1] where required in [2].
>>
>> Let me know if you are okay with this proposal.
>> I can also send an updated patch series with all the patches combined?
>>
>> [1] https://lore.kernel.org/all/20250206131428.3261578-1-zhenglifeng1@huawei.com/
>> [2] https://lore.kernel.org/lkml/20250211103737.447704-1-sumitg@nvidia.com/
>>
>> Regards,
>> Sumit Gupta
>>
> 
> Hi Sumit,
> 
> Over the past few days, I've been thinking about your proposal and
> scenario.
> 
> I think we both agree that PATCH 1-7 in [1] doesn't conflicts with [2], so
> the rest of the discussion focuses on the differences between [2] and the
> PATCH 8 in [1].
> 
> We both tried to support autonomous selection mode in cppc_cpufreq but on
> different ways. I think the differences between these two approaches can be
> summarized into three questions:
> 
> 1. Which sysfs files to expose? I think this is not a problem, we can keep
> all of them.
> 
> 2. Where to expose these sysfs files? I understand your willing to keep all
> cpc registers together under acpi_cppc sysfs. But in my opinion, it is more
> suitable to expose them under cppc_cpufreq_attr, for these reasons:
> 
>    1) It may probably introduce concurrency and data consistency issues, as
> I mentioned before.
> 

As explained in previous reply, this will be solved with the ifdef
check to enable the attributes for only those CPUFREQ drivers which want
to use the generic nodes.
  e.g: '#ifdef CONFIG_ACPI_CPPC_CPUFREQ' for 'cppc_cpufreq'.
			
These CPC register read/write sysfs nodes are generic as per the ACPI
specification and without any vendor specific logic.
So, adding them in the lib file 'cppc_acpi.c'(CONFIG_ACPI_CPPC_LIB) will
avoid code duplication if a different or new ACPI based CPUFREQ driver
also wants to use them just by adding their macro check. Such ifdef 
check is also used in other places for attributes creation like below.
So, don't look like a problem.
  $ grep -A4 "acpi_cpufreq_attr\[" drivers/cpufreq/acpi-cpufreq.c
  static struct freq_attr *acpi_cpufreq_attr[] = {
	&freqdomain_cpus,
  #ifdef CONFIG_X86_ACPI_CPUFREQ_CPB
	&cpb,
  #endif

>    2) The store functions call cpufreq_cpu_get() to get policy and update
> the driver_data which is a cppc_cpudata. Only the driver_data in
> cppc_cpufreq's policy is a cppc_cpudata! These operations are inappropriate
> in cppc_acpi. This file currently provides interfaces for cpufreq drivers
> to use. Reverse calls might mess up call relationships, break code
> structures, and cause problems that are hard to pinpoint the root cause!
> 

If we don't want to update the cpufreq policy from 'cppc_acpi.c' and 
only update it from within the cpufreq,	then this could be one valid
point to not add the write syfs nodes in 'cppc_acpi.c' lib file.

@Rafael, @Viresh : Do you have any comments on this?

>    3) Difficult to extend. Different cpufreq drivers may have different
> processing logic when reading from and writing to these CPC registers.
> Limiting all sysfs here makes it difficult for each cpufreq driver to
> extend. I think this is why there are only read-only interfaces under
> cppc_attrs before.
> 

We are updating the CPC registers as per the generic ACPI specification.
So, any ACPI based CPUFREQ driver can use these generic nodes to
read/write reg's until they have a vendor specific requirement or
implementation.
As explained above, If someone wants to update in different way and use
their own CPUFREQ driver then these generic attributes won't be created
due to the CPUFREQ driver macro check.
I think AMD and Intel are doing more than just reading/updating the 
registers. That's why they needed their driver specific implementations.

> Adding a 'ifdef' is not a good way to solve these problems. Defining this
> config does not necessarily mean that the cpufreq driver is cppc_cpufreq.
> 

It means that only.
  ./drivers/cpufreq/Makefile:obj-$(CONFIG_ACPI_CPPC_CPUFREQ) += 
cppc_cpufreq.o

> 3. Is it necessary to add a new driver instance? [1] exposed the sysfs
> files to support users dynamically change the auto selection mode of each
> policy. Each policy can be operated seperately. It seems to me that if you
> want to boot all CPUs in auto mode, it should be sufficient to set all
> relevant registers to the correct values at boot time. I can't see why the
> new instance is necessary unless you explain it further. Could you explain
> more about why you add a new instance starting from answer these questions:
> 
> For a specific CPU, what is the difference between using the two instances
> when auto_sel is 1? And what is the difference when auto_sel is 0?
> 

Explained this in previous reply. Let me elaborate more.

For hundred's of CPU's, we don't need to explicitly set multiple sysfs
after boot to enable and configure Auto mode with right params. That's 
why an easy option is to pass boot argument or module param for enabling
and configuration.
A separate instance 'cppc_cpufreq_epp' of the 'cppc_cpufreq' driver is
added because policy min/max need to be updated to the min/max_perf
and not nominal/lowest nonlinear perf which is done by the default
init hook. Min_perf value can be lower than lowest nonlinear perf and 
Max_perf can be higher than nominal perf.
If some CPU is booted with epp instance and later the auto mode is 
disabled or min/max_perf is changed from sysfs then also the policy
min/max need to be updated accordingly.

Another is that in Autonomous mode the freq selection and setting is
done by HW. So, cpufreq_driver->target() hook is not needed.
These are few reasons which I am aware of as of now.
I think in future there can be more. Having a separate instance
reflecting a HW based Autonomous frequency selection will make it easy
for any future changes.

> If it turns out that the new instance is necessary, I think we can reach a
> common approach by adding this new cpufreq driver instance and place the
> attributes in 'cppc_cpufreq_epp_attr', like amd-pstate did.
> 
> What do you think?

I initially thought about this but there was a problem.
What if we boot with non-epp instance which doesn't have these 
attributes and later want to enable Auto mode for few CPU's from sysfs.


Best Regards,
Sumit Gupta

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ