linux-kernel - Re: [PATCH 1/3] cpuidle,x86: increase forced cut-off for polling to 20us


Message-ID: <563218DD.302@linaro.org>
Date:	Thu, 29 Oct 2015 14:02:21 +0100
From:	Daniel Lezcano <daniel.lezcano@...aro.org>
To:	Rik van Riel <riel@...hat.com>, linux-kernel@...r.kernel.org
Cc:	arjan@...ux.intel.com, khilman@...com, len.brown@...el.com,
	rafael.j.wysocki@...el.com, javi.merino@....com,
	tuukka.tikkanen@...aro.org
Subject: Re: [PATCH 1/3] cpuidle,x86: increase forced cut-off for polling to
 20us

On 10/29/2015 12:54 PM, Rik van Riel wrote:
> On 10/29/2015 06:17 AM, Daniel Lezcano wrote:
>> On 10/28/2015 11:46 PM, riel@...hat.com wrote:
>>> From: Rik van Riel <riel@...hat.com>
>>>
>>> The cpuidle menu governor has a forced cut-off for polling at 5us,
>>> in order to deal with firmware that gives the OS bad information
>>> on cpuidle states, leading to the system spending way too much time
>>> in polling.
>>
>> Maybe I am misunderstanding your explanation, but that is not how I read
>> the code.
>>
>> The default idle state is C1 (hlt) if no other state suits the
>> constraint. If a timer is happening really soon, then set the default
>> idle state to POLL if no other idle state suits the constraint.
>>
>> That applies only on x86.
>
> With the current code, the default idle state is C1 (hlt) even if
> C1 does not suit the constraint.
>
>> This is not related to break-even but exit latency.
>
> Why would we not care about break-even for C1?
>
> On systems where going into C1 for too-short periods wastes
> power, why would we waste the power when we expect a very
> short sleep?
>
>> IMO, we should just drop this 5us and the POLL state selection in the
>> menu governor, as we have had hyper fast C1 exit for a while. Except a few
>> embedded processors where polling is not adequate.
>
> We have hyper fast C1 exit on Nehalem and newer high performance
> chips. On those chips, we will pick C1 (or deeper) when we have
> an expected sleep time of just a few microseconds.
>
> However, on Atom, and for the paravirt cpuidle driver I am
> working on, C1 exit latency and target residence are higher
> than the cut-off hardcoded in the menu governor.
>
>> Furthermore, the number of times the poll state is selected vs the other
>> states is negligible.
>
> And it will continue to be with this patch, on CPUs with
> hyper fast C1 exit.
>
> Which makes me confused about what you are objecting to,
> since the system should continue to behave the way you want,
> with the patch applied.

OK, I don't object to the correctness of your patch but to the reasoning
behind this small optimization, which brings a lot of mess into the
cpuidle code.

As you are touching this part of the code, I am taking the opportunity to
start a discussion about it.

From my POV, the poll state is *not* an idle state. It is like a
vehicle burnout [1].

But it is inserted into the idle state tables using a trick with the macro
CPUIDLE_DRIVER_STATE_START, which has already led to some bugs.

So instead of falling back to the poll state under certain
circumstances, I propose we extract this state from the idle state table
and let the menu governor fail to choose a state (or not).

From the caller, we decide what to do (poll or C1) if the idle state
selection fails, or we choose to poll *before* the selection, as we
already do in kernel/sched/idle.c:

in the idle loop:

if (cpu_idle_force_poll || tick_check_broadcast_expired())
	cpu_idle_poll();
else
	cpuidle_idle_call();

This way, we:

1) factor out the idle state selection with find_deepest_idle_state
2) remove the CPUIDLE_DRIVER_STATE_START macro
3) concentrate the optimization logic outside of a governor, which will
benefit all architectures
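
To make the idea more concrete, here is a rough user-space sketch of the
split, not kernel code and not a patch. Everything in it is hypothetical:
struct idle_state is a simplified stand-in for the real idle state table
entry, select_state() stands for a governor that is allowed to fail
(returning -1), and idle_decide() stands for the caller. Polling on failure
is only one possible policy; the caller could just as well fall back to C1.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an idle state table entry. */
struct idle_state {
	const char *name;
	unsigned int exit_latency_us;
	unsigned int target_residency_us;
};

enum idle_action { ACTION_POLL, ACTION_ENTER_STATE };

/*
 * Governor side: return the index of the deepest state that fits both the
 * predicted idle duration and the latency requirement, or -1 if none fits.
 * The table is assumed to be sorted from shallowest to deepest and holds
 * no POLL entry, so no start offset is needed.
 */
static int select_state(const struct idle_state *states, size_t count,
			unsigned int predicted_us,
			unsigned int latency_req_us)
{
	int chosen = -1;
	size_t i;

	for (i = 0; i < count; i++) {
		if (states[i].target_residency_us > predicted_us)
			continue;
		if (states[i].exit_latency_us > latency_req_us)
			continue;
		chosen = (int)i;
	}

	return chosen;
}

/*
 * Caller side: decide to poll *before* asking the governor, or poll
 * because the governor failed to choose a state.
 */
static enum idle_action idle_decide(const struct idle_state *states,
				    size_t count,
				    unsigned int predicted_us,
				    unsigned int latency_req_us,
				    int force_poll, int *index)
{
	*index = -1;

	if (force_poll)
		return ACTION_POLL;

	*index = select_state(states, count, predicted_us, latency_req_us);
	if (*index < 0)
		return ACTION_POLL;	/* policy lives here, not in the governor */

	return ACTION_ENTER_STATE;
}
```

With a made-up table such as { "C1", 2, 2 } and { "C6", 100, 300 }, a
predicted idle time of 1us makes the selection fail and the caller polls,
while 50us selects C1 and 1000us selects C6.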

Does it make sense?

   -- Daniel

[1] https://en.wikipedia.org/wiki/Burnout_%28vehicle%29


-- 
  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

