Message-ID: <d0db37ef-fbf3-98f3-e3ee-13234cf39111@nvidia.com>
Date:   Fri, 6 Oct 2023 20:44:49 +0530
From:   Sumit Gupta <sumitg@...dia.com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
CC:     <rui.zhang@...el.com>, <lenb@...nel.org>, <treding@...dia.com>,
        <jonathanh@...dia.com>, <linux-acpi@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <linux-tegra@...r.kernel.org>,
        <sanjayc@...dia.com>, <ksitaraman@...dia.com>,
        <srikars@...dia.com>, <jbrasen@...dia.com>, <bbasu@...dia.com>,
        Sumit Gupta <sumitg@...dia.com>
Subject: Re: [Patch v2 2/2] ACPI: processor: reduce CPUFREQ thermal reduction
 pctg for Tegra241


On 04/10/23 01:07, Rafael J. Wysocki wrote:
> On Wed, Sep 13, 2023 at 6:47 PM Sumit Gupta <sumitg@...dia.com> wrote:
>>
>> From: Srikar Srimath Tirumala <srikars@...dia.com>
>>
>> Current implementation of processor_thermal performs software throttling
>> in fixed steps of "20%" which can be too coarse for some platforms.
>> We observed some performance gain after reducing the throttle percentage.
>> Change the CPUFREQ thermal reduction percentage and maximum thermal steps
>> to be configurable. Also, update the default values of both for Nvidia
>> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
>> and accordingly the maximum number of thermal steps are increased as they
>> are derived from the reduction percentage.
>>
>> Signed-off-by: Srikar Srimath Tirumala <srikars@...dia.com>
>> Signed-off-by: Sumit Gupta <sumitg@...dia.com>
>> ---
>>   drivers/acpi/processor_thermal.c | 41 +++++++++++++++++++++++++++++---
>>   1 file changed, 38 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
>> index b7c6287eccca..30f2801abce6 100644
>> --- a/drivers/acpi/processor_thermal.c
>> +++ b/drivers/acpi/processor_thermal.c
>> @@ -26,7 +26,16 @@
>>    */
>>
>>   #define CPUFREQ_THERMAL_MIN_STEP 0
>> -#define CPUFREQ_THERMAL_MAX_STEP 3
>> +
>> +static int cpufreq_thermal_max_step = 3;
> 
> __read_mostly I suppose?
> 

Added in v3.

>> +
>> +/*
>> + * Minimum throttle percentage for processor_thermal cooling device.
> 
> + *
> 
>> + * The processor_thermal driver uses it to calculate the percentage amount by
>> + * which cpu frequency must be reduced for each cooling state. This is also used
>> + * to calculate the maximum number of throttling steps or cooling states.
>> + */
>> +static int cpufreq_thermal_pctg = 20;
> 
> __read_mostly here too?
> 

Added in v3.

>>
>>   static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);
>>
>> @@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
>>          if (!cpu_has_cpufreq(cpu))
>>                  return 0;
>>
>> -       return CPUFREQ_THERMAL_MAX_STEP;
>> +       return cpufreq_thermal_max_step;
>>   }
>>
>>   static int cpufreq_get_cur_state(unsigned int cpu)
>> @@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>>                  if (!policy)
>>                          return -EINVAL;
>>
>> -               max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
>> +               max_freq = (policy->cpuinfo.max_freq *
>> +                           (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;
>>
>>                  cpufreq_cpu_put(policy);
>>
>> @@ -126,10 +136,35 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>>          return 0;
>>   }
>>
> 
> #ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
> 
>> +#define SMCCC_SOC_ID_T241      0x036b0241
>> +
>> +void acpi_thermal_cpufreq_config_nvidia(void)
> 
> static void ?
> 

Added in v3.

>> +{
>> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
>> +       s32 soc_id = arm_smccc_get_soc_id_version();
>> +
>> +       /* Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) */
>> +       if ((soc_id < 0) || (soc_id != SMCCC_SOC_ID_T241))
> 
> Inner parens are redundant.
> 

Removed in v3.

>> +               return;
>> +
>> +       /* Reduce the CPUFREQ Thermal reduction percentage to 5% */
>> +       cpufreq_thermal_pctg = 5;
>> +
>> +       /*
>> +        * Derive the MAX_STEP from minimum throttle percentage so that the reduction
>> +        * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
>> +        * the CPU performance doesn't become 0.
>> +        */
>> +       cpufreq_thermal_max_step = ((100 / cpufreq_thermal_pctg) - 1);
> 
> Outer parens are redundant.
> 

ACK.

>> +#endif
>> +}
> 
> #else
> static inline void acpi_thermal_cpufreq_config_nvidia(void) {}
> #endif
> 
>> +
>>   void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
>>   {
>>          unsigned int cpu;
>>
>> +       acpi_thermal_cpufreq_config_nvidia();
>> +
>>          for_each_cpu(cpu, policy->related_cpus) {
>>                  struct acpi_processor *pr = per_cpu(processors, cpu);
>>                  int ret;
>> --
> 
> And patch [1/2] needs to be rebased on the current ACPI thermal
> material in linux-next.
> 

Ok.

> Thanks!
