Message-ID: <37D7C6CF3E00A74B8858931C1DB2F077018D34E1@SHSMSX103.ccr.corp.intel.com>
Date: Thu, 6 Aug 2015 23:40:16 +0000
From: "Liang, Kan" <kan.liang@...el.com>
To: Stephane Eranian <eranian@...gle.com>
CC: Peter Zijlstra <peterz@...radead.org>,
"mingo@...hat.com" <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH V2 1/1] perf/x86: Add Intel power cstate PMUs support
> >> On Thu, Aug 6, 2015 at 1:25 PM, Liang, Kan <kan.liang@...el.com> wrote:
> >> >
> >> >> >> >> >> +static cpumask_t power_cstate_core_cpu_mask;
> >> >> >> >> >
> >> >> >> >> > That one typically does not need a cpumask.
> >> >> >> >> >
> >> >> >> >> You need to pick one CPU out of the multi-core. But it is
> >> >> >> >> for client parts thus there is only one socket. At least
> >> >> >> >> this is my understanding.
> >> >> >> >>
> >> >> >> >
> >> >> >> > CORE_C*_RESIDENCY counters are available per physical
> >> >> >> > processor core, so the logical processors in the same
> >> >> >> > physical core share the same counter.
> >> >> >> > I think we need the cpumask to identify the default logical
> >> >> >> > processor which does the counting.
> >> >> >> >
> >> >> >> Did you restrict these events to system-wide mode only?
> >> >> >>
> >> >> Ok, so that means that your cpumask includes one HT per physical
> >> >> core.
> >> >> But then, the result is not the simple aggregation of all the N/2 CPUs.
> >> >
> >> > The counter counts per physical core. The result is the aggregation
> >> > of all HT cpus in same physical core.
> >>
> >> But then don't you need to divide by 2 to get a meaningful result?
> >
> > On second thought, I think I was unclear about the aggregation of
> > all HT CPUs in the same physical core.
> >
> > The physical core C-state should equal min(logical core C-state).
> > So the physical core enters C6 only when all of its logical cores
> > have entered C6, and only then does CORE_C6_RESIDENCY count.
> >
> > So if we count CORE_C6_RESIDENCY on only one logical core/HT, we
> > don't need to divide by 2. The count is the residency during which
> > all logical cores are in C6 (some may be deeper).
> >
> Ok and here you are assuming you are only measuring one logical CPU per
> physical core. If this is the case, then I think you are alright. But I wonder
> what you'd get when perf stat -a aggregates across all measured CPUs, i.e.,
> one CPU per core.

perf stat -a just adds them all together.
I think we do the same thing for other PMUs as well.
For uncore or RAPL, we get a meaningful result by applying --per-socket.
Here we can use --per-core.

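
To make the counting argument concrete, here is a minimal user-space sketch in C. It is not kernel code and not taken from the patch. It assumes a made-up model in which each logical CPU's C-state is sampled once per tick as an integer depth (0 = C0, 6 = C6, 7 = C7). The helper names core_cstate(), core_c6_residency() and sum_per_core() are hypothetical. It shows that a per-core residency read from one logical CPU needs no division by the number of HTs, and that the system-wide number is the plain sum of one reading per core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: the physical core is only as deep as its
 * shallowest hardware thread, i.e. min(logical C-state).
 */
static int core_cstate(const int *thread_state, size_t nthreads)
{
    assert(nthreads > 0);

    int min = thread_state[0];
    for (size_t i = 1; i < nthreads; i++)
        if (thread_state[i] < min)
            min = thread_state[i];
    return min;
}

/*
 * Number of ticks during which the whole core is in C6 or deeper.
 * 'samples' is a flat array laid out as samples[tick * nthreads + thread].
 * The result is already a per-core value; it is not divided by nthreads.
 */
static uint64_t core_c6_residency(const int *samples, size_t nticks,
                                  size_t nthreads)
{
    uint64_t ticks = 0;

    for (size_t t = 0; t < nticks; t++)
        if (core_cstate(samples + t * nthreads, nthreads) >= 6)
            ticks++;
    return ticks;
}

/* System-wide aggregation: one reading per physical core, summed. */
static uint64_t sum_per_core(const uint64_t *per_core, size_t ncores)
{
    uint64_t total = 0;

    for (size_t c = 0; c < ncores; c++)
        total += per_core[c];
    return total;
}
```

For example, with two HTs sampled over four ticks as (C0,C6), (C6,C6), (C6,C7), (C6,C0), core_c6_residency() returns 2: only the second and third ticks have both threads in C6 or deeper. The summed total across cores is hard to interpret on its own, which is why a per-core breakdown is the more meaningful view.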
Thanks,
Kan