[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <172043a7-d180-292e-249b-7aea16c2209a@intel.com>
Date:   Wed, 7 Oct 2020 17:40:43 +0200
From:   "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     Takashi Iwai <tiwai@...e.de>, linux-kernel@...r.kernel.org
Subject: Re: ACPI _CST introduced performance regresions on Haswll
On 10/6/2020 9:47 PM, Mel Gorman wrote:
> On Tue, Oct 06, 2020 at 08:03:22PM +0100, Mel Gorman wrote:
>> On Tue, Oct 06, 2020 at 06:00:18PM +0200, Rafael J. Wysocki wrote:
>>>> server systems") and enable-cst is the commit. It was not fixed by 5.6 or
>>>> 5.9-rc8. A lot of bisections ended up here including kernel compilation,
>>>> tbench, syscall entry/exit microbenchmark, hackbench, Java workloads etc.
>>>>
>>>> What I don't understand is why. The latencies for c-state exit states
>>>> before and after the patch are both as follows
>>>>
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state3/latency:33
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state4/latency:133
>>>>
>>>> Perf profiles did not show up anything interesting. A diff of
>>>> /sys/devices/system/cpu/cpu0/cpuidle/state0/ before and after the patch
>>>> showed up nothing interesting. Any idea why exactly this patch shows up
>>>> as being hazardous on Haswell in particular?
>>>>
>>> Presumably, some of the idle states are disabled by default on the affected
>>> machines.
>>>
>>> Can you check the disable and default_status attributes of each state before
>>> and after the commit in question?
>>>
>> # grep . pre-cst/cpuidle/state*/disable
> Sorry, second attempt after thinking the results made no sense at all.
> Turns out I fat fingered setting up the enable-cst kernel the second time
> to collect what you asked for and the patch was not applied at all.
>
> # grep . pre-cst/cpuidle/state*/disable
> pre-cst/cpuidle/state0/disable:0
> pre-cst/cpuidle/state1/disable:0
> pre-cst/cpuidle/state2/disable:0
> pre-cst/cpuidle/state3/disable:0
> pre-cst/cpuidle/state4/disable:0
> # grep . pre-cst/cpuidle/state*/default_status
> pre-cst/cpuidle/state0/default_status:enabled
> pre-cst/cpuidle/state1/default_status:enabled
> pre-cst/cpuidle/state2/default_status:enabled
> pre-cst/cpuidle/state3/default_status:enabled
> pre-cst/cpuidle/state4/default_status:enabled
> # grep . enable-cst/cpuidle/state*/disable
> enable-cst/cpuidle/state0/disable:0
> enable-cst/cpuidle/state1/disable:0
> enable-cst/cpuidle/state2/disable:0
> enable-cst/cpuidle/state3/disable:1
> enable-cst/cpuidle/state4/disable:1
> # grep . enable-cst/cpuidle/state*/default_status
> enable-cst/cpuidle/state0/default_status:enabled
> enable-cst/cpuidle/state1/default_status:enabled
> enable-cst/cpuidle/state2/default_status:enabled
> enable-cst/cpuidle/state3/default_status:disabled
> enable-cst/cpuidle/state4/default_status:disabled
>
> That looks like C3 and C6 are disabled after the patch.
>
> # grep . enable-cst/cpuidle/state*/name
> enable-cst/cpuidle/state0/name:POLL
> enable-cst/cpuidle/state1/name:C1
> enable-cst/cpuidle/state2/name:C1E
> enable-cst/cpuidle/state3/name:C3
> enable-cst/cpuidle/state4/name:C6
>
That's kind of unexpected and there may be two reasons for that.
First off, the MWAIT hints in the ACPI tables for C3 and C6 may be 
different from the ones in the intel_idle internal table.
Second, the ACPI tables may only be listing C1.
Can you send me the acpidump output from the affected machine, please?
Powered by blists - more mailing lists
 
