linux-kernel - RE: [intel-pstate driver regression] processor frequency very high even if in idle


Date:	Fri, 1 Apr 2016 16:36:51 -0700
From:	"Doug Smythies" <dsmythies@...us.net>
To:	"'Rafael J. Wysocki'" <rafael@...nel.org>
Cc:	"'Srinivas Pandruvada'" <srinivas.pandruvada@...ux.intel.com>,
	'Jörg Otte' <jrg.otte@...il.com>,
	"'Rafael J. Wysocki'" <rjw@...ysocki.net>,
	"'Linux Kernel Mailing List'" <linux-kernel@...r.kernel.org>,
	"'Linux PM list'" <linux-pm@...r.kernel.org>
Subject: RE: [intel-pstate driver regression] processor frequency very high even if in idle

On 2016.04.01 12:54 Rafael J. Wysocki wrote:
>On Fri, Apr 1, 2016 at 8:31 PM, Doug Smythies <dsmythies@...us.net> wrote:
>> On 2016.04.01 10:45 Srinivas Pandruvada wrote:
>>> On Fri, 2016-04-01 at 16:06 +0200, Jörg Otte wrote:
>> > > > > >
>>>> Done. Attached the tracer.
>>>> For me it looks like the previous one of the failing case.
>>>
>>> The traces show that idle task is constantly running without sleep.
>>
>> No, they (at least the first one, I didn't look at the next one yet)
>> show that CPUs 2 and 3 are spending around 99% of their time not in state
>> C0.

> How do you figure that out if I may ask?  It is not so obvious to me
> to be honest.

The trace was not in the form for the post processing tools, so I had
to manually import the trace into a spreadsheet and manually add new columns
calculated from the others.

Load (%) = mperf / tsc * 100 = percentage of time spent in C0.
Duration (ms) = tsc / 2.5e9 * 1000
Note: I do not recall seeing an exact TSC frequency for Jörg's computer, so I
used the 2.5 GHz from the device spec given in some earlier e-mail.

Example (formatting will likely not send O.K.):

task      CPU#   time         core_busy  scaled  from  to  mperf    aperf    tsc       freq     load    duration (ms)
<idle>-0  [002]  465.879451:  100        96      26    26  1826656  1826710  25062693  2500073  7.288%  10.025
<idle>-0  [003]  465.879484:  99         96      26    26  305796   305781   25147993  2499877  1.216%  10.059
<idle>-0  [000]  465.885794:  100        96      26    26  975908   975951   32434672  2500110  3.009%  12.974
<idle>-0  [001]  465.886898:  100        250     10    31  327356   327364   26673840  2500061  1.227%  10.670
<idle>-0  [002]  465.889527:  100        96      26    26  205336   205365   25133396  2500353  0.817%  10.053
<idle>-0  [003]  465.889555:  99         95      26    26  62544    62341    25117916  2491885  0.249%  10.047
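
For anyone who wants to repeat the spreadsheet step, here is a minimal Python sketch of the two derived columns. It only restates the formulas above; the function names are made up for illustration, the (cpu, mperf, tsc) values are copied from the example rows, and the 2.5 GHz TSC rate is the same assumption as noted above.

```python
# Recompute the "load" and "duration (ms)" columns from mperf and tsc.
# TSC_HZ is an assumption (2.5 GHz from the device spec), not a measured value.
TSC_HZ = 2.5e9

# (cpu, mperf, tsc) copied from the example trace rows.
SAMPLES = [
    (2, 1826656, 25062693),
    (3, 305796, 25147993),
    (0, 975908, 32434672),
    (1, 327356, 26673840),
    (2, 205336, 25133396),
    (3, 62544, 25117916),
]


def load_percent(mperf, tsc):
    """Share of the sample interval spent in C0, in percent."""
    return mperf / tsc * 100.0


def duration_ms(tsc, tsc_hz=TSC_HZ):
    """Length of the sample interval in milliseconds."""
    return tsc / tsc_hz * 1000.0


for cpu, mperf, tsc in SAMPLES:
    print(f"CPU {cpu}: load {load_percent(mperf, tsc):.3f}%  "
          f"duration {duration_ms(tsc):.3f} ms")
```

If the real TSC frequency differs from 2.5 GHz, only the duration column changes; the load column is a ratio of two counters and does not depend on it.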

> That the sample rate is ending up at ~10 milliseconds indicates some
> high frequency (>= 100 Hz) events on those CPUs. Those events, apparently,
> take very little CPU time to complete, hence a load of about 1% on average.
>
> By the way, I can easily recreate the high sample rate with virtually no
> load on my system, but so far I have been unable to get the high CPU
> frequencies observed by Jörg. I can get my system to a target pstate of
> about 20 where it should have remained at 16, but that is about it.
>
>> The driver is processing samples for idle task for every 10ms and
>> aperf/mperf are showing that we are always in turbo mode for idle task.
>
> That column pretty much always says "idle" (or swapper for my way of doing
> things). I have not found it to be very useful as an indicator, and even
> less so since the utilization changes.
>
>>
>> Need to find out why idle task is not sleeping.
>
> I contend that it is.

> Why?

Unless I misunderstood, because the trace data indicates that those CPUs
are in some C state deeper than C0 for most of the time.
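
As a rough cross-check of that statement, the sketch below sums mperf and tsc per CPU and reports the share of time not in C0. It is an illustration only: the helper name is hypothetical, and it uses just the four example rows for CPUs 2 and 3, so its numbers describe that short excerpt and not the whole trace.

```python
from collections import defaultdict

# (cpu, mperf, tsc) for the CPU 2 and CPU 3 rows of the example excerpt.
SAMPLES = [
    (2, 1826656, 25062693),
    (3, 305796, 25147993),
    (2, 205336, 25133396),
    (3, 62544, 25117916),
]


def non_c0_percent(samples):
    """Per-CPU share of time outside C0, weighted by sample length."""
    mperf_sum = defaultdict(int)
    tsc_sum = defaultdict(int)
    for cpu, mperf, tsc in samples:
        mperf_sum[cpu] += mperf
        tsc_sum[cpu] += tsc
    # mperf only advances in C0, so mperf/tsc is the C0 fraction.
    return {cpu: 100.0 * (1.0 - mperf_sum[cpu] / tsc_sum[cpu])
            for cpu in tsc_sum}


for cpu, pct in sorted(non_c0_percent(SAMPLES).items()):
    print(f"CPU {cpu}: {pct:.2f}% of the time not in C0")
```

Summing the counters before dividing weights each sample by its duration, which avoids giving a short, busy sample the same weight as a long, idle one.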

... Doug