[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <251f40a6-9dd6-9611-ce77-b8eee4cf8fc8@amd.com>
Date: Mon, 30 Sep 2019 15:34:09 +0000
From: "Natarajan, Janakarajan" <Janakarajan.Natarajan@....com>
To: shuah <shuah@...nel.org>,
"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC: Thomas Renninger <trenn@...e.com>, Pu Wen <puwen@...on.com>,
Borislav Petkov <bp@...e.de>,
Allison Randal <allison@...utok.net>,
Thomas Gleixner <tglx@...utronix.de>,
Kate Stewart <kstewart@...uxfoundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Richard Fontana <rfontana@...hat.com>
Subject: Re: [PATCH 1/2] Modify cpupower to schedule itself on cores it is
reading MSRs from
On 9/27/2019 1:59 PM, shuah wrote:
> On 9/18/19 10:34 AM, Natarajan, Janakarajan wrote:
>> Modify cpupower to schedule itself on each of the cpus in the system and
>> then get the APERF/MPERF register values.
>>
>> This is advantageous because an IPI is not generated when a
>> read_msr() is
>> executed on the local logical CPU thereby reducing the chance of having
>> APERF and MPERF being out of sync.
>
> Somehow this doesn't read right. Is this that you are trying to avoid
> APERF and MPERF being out of sync with this change?
>
> This description is rather confusing.
We are trying to avoid a separate IPI for APERF and MPERF. We have seen
that a separate IPI for
each can cause APERF and MPERF to go out of sync.
If the cpupower is already executing on the core it wants to get the
APERF/MPERF from, then
generic_exec_single() does not generate the IPI to execute the rdmsr().
Rather it just executes
on the current cpu.
I can change the description to add more detail.
>
>>
>> Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@....com>
>> ---
>> .../utils/idle_monitor/mperf_monitor.c | 38 ++++++++++++++-----
>> 1 file changed, 28 insertions(+), 10 deletions(-)
>>
>> diff --git a/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c
>> b/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c
>> index 44806a6dae11..8b072e39c897 100644
>> --- a/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c
>> +++ b/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c
>> @@ -10,6 +10,7 @@
>> #include <stdlib.h>
>> #include <string.h>
>> #include <limits.h>
>> +#include <sched.h>
>> #include <cpufreq.h>
>> @@ -86,15 +87,33 @@ static int mperf_get_tsc(unsigned long long *tsc)
>> return ret;
>> }
>> +static int get_aperf_mperf(int cpu, unsigned long long *aval,
>> + unsigned long long *mval)
>> +{
>> + cpu_set_t set;
>> + int ret;
>> +
>> + CPU_ZERO(&set);
>> + CPU_SET(cpu, &set);
>> + if (sched_setaffinity(getpid(), sizeof(set), &set) == -1) {
>> + dprint("Could not migrate to cpu: %d\n", cpu);
>> + return 1;
>> + }
>> +
>> + ret = read_msr(cpu, MSR_APERF, aval);
>> + ret |= read_msr(cpu, MSR_MPERF, mval);
>> +
>> + return ret;
>> +}
>> +
>> static int mperf_init_stats(unsigned int cpu)
>> {
>> - unsigned long long val;
>> + unsigned long long aval, mval;
>> int ret;
>> - ret = read_msr(cpu, MSR_APERF, &val);
>> - aperf_previous_count[cpu] = val;
>> - ret |= read_msr(cpu, MSR_MPERF, &val);
>> - mperf_previous_count[cpu] = val;
>> + ret = get_aperf_mperf(cpu, &aval, &mval);
>
> get_aperf_mperf() could return error right? It returns 1 when
> sched_setaffinity() fails. Shouldn't the return value checked,
> instead of using aval and mval?
We set the is_valid[cpu] to the return value. Later on the is_valid[cpu]
is checked before proceeding to calculate the effective freq.
Thanks.
>
>> + aperf_previous_count[cpu] = aval;
>> + mperf_previous_count[cpu] = mval;
>> is_valid[cpu] = !ret;
>> return 0;
>> @@ -102,13 +121,12 @@ static int mperf_init_stats(unsigned int cpu)
>> static int mperf_measure_stats(unsigned int cpu)
>> {
>> - unsigned long long val;
>> + unsigned long long aval, mval;
>> int ret;
>> - ret = read_msr(cpu, MSR_APERF, &val);
>> - aperf_current_count[cpu] = val;
>> - ret |= read_msr(cpu, MSR_MPERF, &val);
>> - mperf_current_count[cpu] = val;
>> + ret = get_aperf_mperf(cpu, &aval, &mval);
>
> Same comments as above here.
>
>> + aperf_current_count[cpu] = aval;
>> + mperf_current_count[cpu] = mval;
>> is_valid[cpu] = !ret;
>> return 0;
>>
>
> thanks,
> -- Shuah
Powered by blists - more mailing lists