Message-ID: <1610826.tK4V65ABJD@vostro.rjw.lan>
Date: Thu, 11 Sep 2014 00:49:48 +0200
From: "Rafael J. Wysocki" <rjw@...ysocki.net>
To: Anup Chenthamarakshan <anupc@...omium.org>
Cc: Dirk Brandewie <dirk.brandewie@...il.com>,
Sameer Nanda <snanda@...omium.org>,
Viresh Kumar <viresh.kumar@...aro.org>,
linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] intel_pstate: track and export frequency residency stats via sysfs.
On Wednesday, September 10, 2014 03:15:08 PM Anup Chenthamarakshan wrote:
> On Wed, Sep 10, 2014 at 09:39:30AM -0700, Dirk Brandewie wrote:
> > On 09/09/2014 04:22 PM, Anup Chenthamarakshan wrote:
> > >On Tue, Sep 09, 2014 at 08:15:13AM -0700, Dirk Brandewie wrote:
> > >>On 09/08/2014 05:10 PM, Anup Chenthamarakshan wrote:
> > >>>Exported stats appear in
> > >>><sysfs>/devices/system/cpu/intel_pstate/time_in_state as follows:
> > >>>
> > >>>## CPU 0
> > >>>400000 3647
> > >>>500000 24342
> > >>>600000 144150
> > >>>700000 202469
> > >>>## CPU 1
> > >>>400000 4813
> > >>>500000 22628
> > >>>600000 149564
> > >>>700000 211885
> > >>>800000 173890
> > >>>
> > >>>Signed-off-by: Anup Chenthamarakshan <anupc@...omium.org>
> > >>
> > >>What is this information being used for?
> > >
> > >I'm using P-state residency information in power consumption tests to calculate
> > >the proportion of time spent in each P-state across all processors (one global
> > >set of percentages, one per P-state). This is used to validate new changes
> > >from the power perspective. Essentially, these are sanity checks to flag
> > >changes that cause a large difference in P-state residency.
> > >
> > >So far, we've been using the data exported by acpi-cpufreq to track this.
> > >
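A minimal sketch of the calculation described above, added for illustration and not taken from the patch. It assumes the "## CPU n" layout shown in the sample output and treats the second column as an opaque time count, since the thread does not state its unit. The function names are hypothetical.

```python
from collections import defaultdict

# Sample copied from the time_in_state output quoted earlier in this thread.
SAMPLE = """\
## CPU 0
400000 3647
500000 24342
600000 144150
700000 202469
## CPU 1
400000 4813
500000 22628
600000 149564
700000 211885
800000 173890
"""

def parse_time_in_state(text):
    """Return {cpu: {freq_khz: time}} from the '## CPU n' sectioned format."""
    per_cpu = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("## CPU"):
            current = int(line.split()[-1])
            per_cpu[current] = {}
        elif current is not None:
            freq, time_count = line.split()
            per_cpu[current][int(freq)] = int(time_count)
    return per_cpu

def global_residency(per_cpu):
    """Sum time per P-state over all CPUs and return {freq_khz: percent}."""
    totals = defaultdict(int)
    for states in per_cpu.values():
        for freq, time_count in states.items():
            totals[freq] += time_count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {f: 100.0 * t / grand_total for f, t in sorted(totals.items())}

if __name__ == "__main__":
    for freq, pct in global_residency(parse_time_in_state(SAMPLE)).items():
        print(f"{freq} kHz: {pct:.2f}%")
```

Note that these percentages describe the residency of the per-core request, which is the limitation discussed in the rest of the thread.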
> > >>
> > >>Tracking the current P state request for each core is only part of the
> > >>story. The processor aggregates the requests from all cores and then decides
> > >>what frequency the package will run at; this evaluation happens on a ~1ms time
> > >>frame. If a core is idle then it loses its vote for what the package frequency
> > >>will be, and its frequency will be zero even though it may have been requesting
> > >>a high P state when it went idle. Tracking the residency of the requested
> > >>P state doesn't provide much useful information other than ensuring that the
> > >>requests are changing over time IMHO.
> > >
> > >This is exactly why we're trying to track it.
> >
> > My point is that you are tracking the residency of the request and not
> > the P state the package was running at. On a lightly loaded system
> > it is not unusual for a core that was very busy and requesting a high
> > P state to go idle for several seconds. In this case that core would
> > lose its vote for the package P state but the stats would show that
> > the P state was high for a very long time when its real frequency
> > was zero.
>
> I see what you're saying. Requesting a p-state does not necessarily mean that is
> the state the CPU is in.
>
> >
> > There are a couple of ways to get what I consider better information
> > about what is actually going on.
> >
> > The current turbostat provides C state residency and calculates the
> > average/effective frequency of the core over its sample time.
> > Turbostat will also measure the power consumption from the CPU point
> > of view if your processor supports the RAPL registers.
> >
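A rough sketch of how an average/effective frequency can be derived over a sample interval, added for illustration. It assumes the usual approach of sampling the APERF, MPERF and TSC counters at the start and end of the interval; this is my understanding of the method rather than a copy of turbostat's code, and the function names are hypothetical.

```python
def busy_mhz(base_mhz, aperf_delta, mperf_delta):
    """Effective frequency while the core was not idle.

    APERF advances at the actual clock and MPERF at the base clock, both
    only while the core is in C0, so their ratio scales the base frequency.
    """
    if mperf_delta == 0:
        return 0.0  # core never left idle during the sample
    return base_mhz * aperf_delta / mperf_delta

def average_mhz(aperf_delta, interval_s):
    """Frequency averaged over the whole interval, idle time included."""
    return aperf_delta / interval_s / 1e6

def busy_fraction(mperf_delta, tsc_delta):
    """Share of the interval spent in C0 (assumes TSC ticks at base clock)."""
    if tsc_delta == 0:
        return 0.0
    return mperf_delta / tsc_delta
```

A core that requested a high P state and then idled for most of the interval shows a small busy fraction and a low average frequency here, which is the information the request residency alone does not give.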
> > Reading MSR 0x198 (MSR_IA32_PERF_STATUS) will tell you what the core
> > would run at if it were not idle; this reflects the decision that the
> > package made based on current requests.
> >
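A minimal sketch of reading that register from user space, added for illustration. It assumes the msr driver is loaded (/dev/cpu/N/msr exists) and that the caller has sufficient privileges. The decoding is also an assumption: it takes the current ratio from bits 15:8 and multiplies by a 100 MHz bus clock, which matches recent Core processors but not every model.

```python
import os
import struct

IA32_PERF_STATUS = 0x198

def read_msr(cpu, reg):
    """Read one 64-bit MSR through the msr driver; usually needs root."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The MSR address is used as the file offset.
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)

def perf_status_ratio(raw):
    """Assumed layout: current ratio in bits 15:8 of IA32_PERF_STATUS."""
    return (raw >> 8) & 0xFF

def ratio_to_khz(ratio, bclk_khz=100_000):
    """Assumed 100 MHz bus clock; pass bclk_khz for other parts."""
    return ratio * bclk_khz

if __name__ == "__main__":
    raw = read_msr(0, IA32_PERF_STATUS)
    print(ratio_to_khz(perf_status_ratio(raw)), "kHz")
```

As noted above, the value is what the core would run at if it were not idle, so it still has to be combined with idle information to say anything about residency.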
> > Using perf to collect power:pstate_sample event will give information
> > about each sample on the core and give you timestamps to detect idle
> > times.
> >
> > Using perf to collect power:cpu_frequency will show when the P state
> > request was changed on each core and is triggered by intel_pstate and
> > acpi_cpufreq.
> >
> > Powertop collects the same information as turbostat and a bunch of
> > other information useful in seeing where you could be burning power
> > for no good reason.
> >
> > For getting an idea of real power turbostat is the easiest to use and
> > is available on most systems. Using perf will give you a very fine grained
> > view of what is going on as well as point to the culprit for bad
> > behaviour in most cases.
>
> Tools like powertop and turbostat are not present by default on all systems,
> so it is not always possible to use them :(

Which systems are you referring to in particular?
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.