linux-kernel - Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for short idle periods

Message-ID: <e116a6d7-f26f-44e8-e56c-7f422e12d2c0@linux.intel.com>
Date:   Wed, 12 Jul 2017 13:22:38 +0800
From:   "Li, Aubrey" <aubrey.li@...ux.intel.com>
To:     paulmck@...ux.vnet.ibm.com
Cc:     Frederic Weisbecker <fweisbec@...il.com>,
        Aubrey Li <aubrey.li@...el.com>, tglx@...utronix.de,
        peterz@...radead.org, len.brown@...el.com, rjw@...ysocki.net,
        ak@...ux.intel.com, tim.c.chen@...ux.intel.com,
        arjan@...ux.intel.com, yang.zhang.wz@...il.com, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v1 04/11] sched/idle: make the fast idle path for
 short idle periods



On 2017/7/12 13:03, Paul E. McKenney wrote:
> On Wed, Jul 12, 2017 at 11:19:59AM +0800, Li, Aubrey wrote:
>> On 2017/7/12 2:11, Paul E. McKenney wrote:
>>> On Tue, Jul 11, 2017 at 06:33:55PM +0200, Frederic Weisbecker wrote:
>>>> On Tue, Jul 11, 2017 at 05:58:47AM -0700, Paul E. McKenney wrote:
>>>>> On Mon, Jul 10, 2017 at 09:38:34AM +0800, Aubrey Li wrote:
>>>>>> From: Aubrey Li <aubrey.li@...ux.intel.com>
>>>>>>
>>>>>> The system will enter a fast idle loop if the predicted idle period
>>>>>> is shorter than the threshold.
>>>>>> ---
>>>>>>  kernel/sched/idle.c | 9 ++++++++-
>>>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
>>>>>> index cf6c11f..16a766c 100644
>>>>>> --- a/kernel/sched/idle.c
>>>>>> +++ b/kernel/sched/idle.c
>>>>>> @@ -280,6 +280,8 @@ static void cpuidle_generic(void)
>>>>>>   */
>>>>>>  static void do_idle(void)
>>>>>>  {
>>>>>> +	unsigned int predicted_idle_us;
>>>>>> +	unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
>>>>>>  	/*
>>>>>>  	 * If the arch has a polling bit, we maintain an invariant:
>>>>>>  	 *
>>>>>> @@ -291,7 +293,12 @@ static void do_idle(void)
>>>>>>
>>>>>>  	__current_set_polling();
>>>>>>
>>>>>> -	cpuidle_generic();
>>>>>> +	predicted_idle_us = cpuidle_predict();
>>>>>> +
>>>>>> +	if (likely(predicted_idle_us < short_idle_threshold))
>>>>>> +		cpuidle_fast();
>>>>>
>>>>> What if we get here from nohz_full usermode execution?  In that
>>>>> case, if I remember correctly, the scheduling-clock interrupt
>>>>> will still be disabled, and would have to be re-enabled before
>>>>> we could safely invoke cpuidle_fast().
>>>>>
>>>>> Or am I missing something here?
>>>>
>>>> That's a good point. It's partially ok because if the tick is needed
>>>> for something specific, it is not entirely stopped but programmed to that
>>>> deadline.
>>>>
>>>> Now there is some idle specific code when we enter dynticks-idle. See
>>>> tick_nohz_start_idle(), tick_nohz_stop_idle(), sched_clock_idle_wakeup_event()
>>>> and some subsystems that react differently when we enter dyntick idle
>>>> mode (scheduler_tick_max_deferment) so the tick may need a reevaluation.
>>>>
>>>> For now I'd rather suggest that we treat full nohz as an exception case here
>>>> and do:
>>>>
>>>>     if (!tick_nohz_full_cpu(smp_processor_id()) && likely(predicted_idle_us < short_idle_threshold))
>>>>         cpuidle_fast();
>>>>
>>>> Ugly but safer!
>>>
>>> Works for me!
>>
>> I guess those who enable full nohz (for example, the financial guys who need the
>> system to respond as fast as possible) will not like this compromise ;)
> 
> And some HPC guys and some real-time guys with CPU-bound real-time
> processing, so there are likely quite a few different views on this
> compromise.
> 
>> How about adding rcu_idle enter/exit back only for the full nohz case in fast idle?
>> The RCU idle calls are the only risky operations to remove from the fast idle path.
>> Compared to adding RCU idle back, going through the normal idle path has more
>> overhead, IMHO.
> 
> That might work, but I would need to see the actual patch.  Frederic
> Weisbecker should look at it as well.
> 
Okay, let me address the first round of comments and deliver v2 soon.

Thanks,
-Aubrey
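
For reference, the check discussed above can be modelled in plain userspace C.
This is only a sketch under stated assumptions, not kernel code and not the
eventual v2 patch: usecs_per_jiffy(), short_idle_threshold_us(),
choose_idle_path() and the idle_path enum are hypothetical names introduced
here for illustration. The cpu_is_nohz_full flag stands in for
tick_nohz_full_cpu(smp_processor_id()), and HZ is passed as a parameter and
assumed to divide 1,000,000 evenly (e.g. 100, 250, 1000).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for jiffies_to_usecs(1): microseconds per tick.
 * Assumes hz divides 1,000,000 evenly; other values are truncated here.
 */
static unsigned int usecs_per_jiffy(unsigned int hz)
{
	assert(hz > 0 && hz <= 1000000u);
	return 1000000u / hz;
}

/* Half a tick, mirroring "jiffies_to_usecs(1) / 2" from the quoted patch. */
static unsigned int short_idle_threshold_us(unsigned int hz)
{
	return usecs_per_jiffy(hz) / 2;
}

/* Hypothetical labels for the two idle paths named in the thread. */
enum idle_path { IDLE_PATH_NORMAL, IDLE_PATH_FAST };

/*
 * Model of the guarded condition suggested in the thread: take the fast
 * path only when the CPU is not a full-nohz CPU and the predicted idle
 * period is strictly shorter than the threshold.
 */
static enum idle_path choose_idle_path(bool cpu_is_nohz_full,
				       unsigned int predicted_idle_us,
				       unsigned int hz)
{
	if (!cpu_is_nohz_full &&
	    predicted_idle_us < short_idle_threshold_us(hz))
		return IDLE_PATH_FAST;

	return IDLE_PATH_NORMAL;
}
```

Under these assumptions the threshold is half a tick, so it scales with HZ,
and a full-nohz CPU always falls back to the normal path no matter how short
the predicted idle period is. That fallback is the compromise debated above.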