Message-ID: <39d37e1b-7959-9a8f-6876-f2ed4c1dbc37@huawei.com>
Date:   Thu, 4 Jun 2020 09:32:41 +0800
From:   Xiongfeng Wang <wangxiongfeng2@...wei.com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Viresh Kumar <viresh.kumar@...aro.org>
CC:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Hanjun Guo <guohanjun@...wei.com>,
        Sudeep Holla <Sudeep.Holla@....com>,
        Ionela Voinescu <ionela.voinescu@....com>,
        Linux PM <linux-pm@...r.kernel.org>,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>
Subject: Re: [Question]: about 'cpuinfo_cur_freq' shown in sysfs when the CPU
 is in idle state

Hi Rafael,

Thanks for your reply!

On 2020/6/3 21:39, Rafael J. Wysocki wrote:
> On Wed, Jun 3, 2020 at 9:52 AM Viresh Kumar <viresh.kumar@...aro.org> wrote:
>>
>> On 02-06-20, 11:34, Xiongfeng Wang wrote:
>>> Hi Viresh,
>>>
>>> Sorry to disturb you about another problem as follows.
>>>
>>> CPPC uses the increments of the Delivered Performance counter and the
>>> Reference Performance counter to compute the CPU frequency and shows it in
>>> sysfs through 'cpuinfo_cur_freq'. But ACPI CPPC doesn't specifically define
>>> the behavior of these two counters when the CPU is in an idle state, for
>>> example whether they stop incrementing.
>>>
>>> The ARMv8.4 extension introduced support for the Activity Monitors Unit (AMU).
>>> The processor frequency cycles and constant frequency cycles counters in the
>>> AMU can be used as the Delivered Performance counter and the Reference
>>> Performance counter. These two AMU counters do not increase when the PE is in
>>> WFI or WFE, so the increment is zero in that case. This causes no issue,
>>> because 'cppc_get_rate_from_fbctrs()' in the cppc_cpufreq driver checks the
>>> increment and returns the desired performance if the increment is zero.
>>>
>>> But when the CPU goes into a power-down idle state, accessing these two AMU
>>> counters through their memory-mapped addresses returns zero. For example, CPU1
>>> goes into a power-down idle state and CPU0 tries to get the frequency of CPU1.
>>> In this situation, a very big value is displayed for 'cpuinfo_cur_freq' in
>>> sysfs. Do you have any advice about this problem?
>>>
>>> I was thinking about the following idea. We can run 'cppc_cpufreq_get_rate()'
>>> on the CPU to be measured, so that we can make sure the CPU is in the C0 state
>>> when we access the two counters. We can also return the actual frequency rather
>>> than the desired performance when the CPU is in WFI/WFE. But this modification
>>> changes the existing logic and I am not sure whether it will have bad effects.
>>>
>>>
>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>> index 257d726..ded3bcc 100644
>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>> @@ -396,9 +396,10 @@ static int cppc_get_rate_from_fbctrs(struct cppc_cpudata *cpu,
>>>         return cppc_cpufreq_perf_to_khz(cpu, delivered_perf);
>>>  }
>>>
>>> -static unsigned int cppc_cpufreq_get_rate(unsigned int cpunum)
>>> +static int cppc_cpufreq_get_rate_cpu(void *info)
>>>  {
>>>         struct cppc_perf_fb_ctrs fb_ctrs_t0 = {0}, fb_ctrs_t1 = {0};
>>> +       unsigned int cpunum = *(unsigned int *)info;
>>>         struct cppc_cpudata *cpu = all_cpu_data[cpunum];
>>>         int ret;
>>>
>>> @@ -418,6 +419,22 @@ static unsigned int cppc_cpufreq_get_rate(unsigned int cpunum)
>>>         return cppc_get_rate_from_fbctrs(cpu, fb_ctrs_t0, fb_ctrs_t1);
>>>  }
>>>
>>> +static unsigned int cppc_cpufreq_get_rate(unsigned int cpunum)
>>> +{
>>> +       int ret;
>>> +
>>> +       ret = smp_call_on_cpu(cpunum, cppc_cpufreq_get_rate_cpu, &cpunum, true);
>>> +
>>> +       /*
>>> +        * convert negative error code to zero, otherwise we will display
>>> +        * an odd value for 'cpuinfo_cur_freq' in sysfs
>>> +        */
>>> +       if (ret < 0)
>>> +               ret = 0;
>>> +
>>> +       return ret;
>>> +}
>>> +
>>>  static int cppc_cpufreq_set_boost(struct cpufreq_policy *policy, int state)
>>>  {
>>>         struct cppc_cpudata *cpudata;
>>
>> I don't see any other sane solution, even if this brings the CPU back
>> to normal state and wastes power. We should be able to reliably provide
>> a value to userspace.
>>
>> Rafael / Sudeep: What do you say?
> 
> The frequency value obtained by kicking the CPU out of idle
> artificially is bogus, though.  You may as well return a random number
> instead.

Yes, it may as well return a random number.

> 
> The frequency of a CPU in an idle state is in fact unknown in the case
> at hand, so returning 0 looks like the cleanest option to me.

I am not sure how users will use 'cpuinfo_cur_freq' in sysfs. If I return 0
when the CPU is idle, then under a light load 'cpuinfo_cur_freq' will read as
zero whenever the CPU happens to be idle and as a non-zero value whenever it is
not. Users may find it odd that 'cpuinfo_cur_freq' switches between zero and a
non-zero value. They may expect it to return the frequency at which the CPU
executes instructions, namely in the C0 state. I am not sure how users will
look at 'cpuinfo_cur_freq'.
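
To make the "return 0 when the frequency is unknown" option concrete, here is
a rough standalone sketch. It is not the cppc_cpufreq code: the struct, the
function name and the parameters are made up for illustration, and it assumes
that a counter which appears to go backwards means the CPU was powered down and
the read returned zero. Counter wraparound is ignored here.

```c
#include <stdint.h>

/* Hypothetical snapshot of the two feedback counters of one CPU. */
struct fb_snapshot {
	uint64_t delivered;	/* cycles at the actual CPU frequency */
	uint64_t reference;	/* cycles at the constant reference frequency */
};

/*
 * Illustrative only: estimate the frequency in kHz from two snapshots.
 * ref_khz is the rate of the reference counter, desired_khz is the last
 * requested frequency. Returns 0 when the frequency is unknown.
 */
static uint64_t estimate_khz(struct fb_snapshot t0, struct fb_snapshot t1,
			     uint64_t ref_khz, uint64_t desired_khz)
{
	uint64_t delta_delivered, delta_reference;

	/*
	 * Assumption: a counter that went backwards means the read hit a
	 * powered-down CPU and returned zero, so the frequency is unknown.
	 */
	if (t1.delivered < t0.delivered || t1.reference < t0.reference)
		return 0;

	delta_delivered = t1.delivered - t0.delivered;
	delta_reference = t1.reference - t0.reference;

	/*
	 * No progress on the reference counter (for example WFI/WFE):
	 * fall back to the desired value and avoid dividing by zero.
	 */
	if (delta_reference == 0)
		return desired_khz;

	return ref_khz * delta_delivered / delta_reference;
}
```

With this sketch, a CPU that is powered down at the second read reports 0, a
CPU sitting in WFI/WFE reports the desired value, and a running CPU reports the
ratio of the two increments scaled by the reference rate. This is exactly the
zero / non-zero switching I am worried about above.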

Thanks,
Xiongfeng

> 
> Thanks!
> 
> .
> 
