lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 10 Aug 2018 09:24:44 +0200
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Leo Yan <leo.yan@...aro.org>
Cc:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux PM <linux-pm@...r.kernel.org>
Subject: Re: [PATCH] sched: idle: Reenable sched tick for cpuidle request

On Fri, Aug 10, 2018 at 7:53 AM,  <leo.yan@...aro.org> wrote:
> On Thu, Aug 09, 2018 at 11:31:46PM +0200, Rafael J . Wysocki wrote:
>
> [...]
>
>> > >> And I really would prefer to avoid restarting the tick here, because
>> > >> it is overhead and quite likely unnecessary.
>> > >
>> > > I understand the logic when read the code, actually I did some experiments
>> > > on the function menu_select(), in menu_select() function it discards the
>> > > consideration for typical pattern interval and it also tries to avoid to
>> > > enable tick and select more shallow state at the bottom of function.  So I
>> > > agree that in the middle of idles it's redundant to reenable tick and the
>> > > code is careful thought.
>> > >
>> > > But this patch tries to rescue the case at the last time the CPU enter one
>> > > shallow idle state but without wake up event.
>> >
>> > It is better to avoid entering a shallow state IMO.  Let me think
>> > about that a bit.
>>
>> The simple change below should address this issue and I don't quite see
>> what it can break.  It may cause deeper idle states to be selected with
>> the tick already stopped, but that really shouldn't be problematic, as
>> (since the tick has been stopped) there are no strict latency constraints,
>> so even if there is an early wakeup, we should be able to tolerate the
>> extra latency just fine.
>>
>> ---
>>  drivers/cpuidle/governors/menu.c |   10 ++++------
>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> Index: linux-pm/drivers/cpuidle/governors/menu.c
>> ===================================================================
>> --- linux-pm.orig/drivers/cpuidle/governors/menu.c
>> +++ linux-pm/drivers/cpuidle/governors/menu.c
>> @@ -349,14 +349,12 @@ static int menu_select(struct cpuidle_dr
>>                * If the tick is already stopped, the cost of possible short
>>                * idle duration misprediction is much higher, because the CPU
>>                * may be stuck in a shallow idle state for a long time as a
>> -              * result of it.  In that case say we might mispredict and try
>> -              * to force the CPU into a state for which we would have stopped
>> -              * the tick, unless a timer is going to expire really soon
>> -              * anyway.
>> +              * result of it.  In that case say we might mispredict and use
>> +              * the known time to the closest timer event for the idle state
>> +              * selection.
>>                */
>>               if (data->predicted_us < TICK_USEC)
>> -                     data->predicted_us = min_t(unsigned int, TICK_USEC,
>> -                                                ktime_to_us(delta_next));
>> +                     data->predicted_us = ktime_to_us(delta_next);
>
> I did the testing on this, but above change cannot really resolve the
> issue, it misses to handle the case if 'data->predicted_us > TICK_USEC';
> if the prediction is longer than TICK_USEC, e.g. data->predicted_us is
> 2ms, TICK_USEC=1ms;  for this case the deepest state will not be
> chosen and if the data->predicted_us is decided by typical pattern
> value but not the closest timer, finally the CPU still might stay in
> shallow state for long time.

I noticed that too in the meantime. :-)

> Actually in the CPU idle loop with the tick is stopped, I think we
> should achieve two targets:
> - Ensure the CPU can enter the deepest idle state at the last time it
>   runs into into idle;
> - In the middle of idles, we will not reenable the tick at all; though
>   the idle states can be chosen a shallow state for short prediction;
>
> To achieve the first target, we need to define what's the possible
> case the CPU might stay into shallow state but cannot be waken up in
> short time; so for this purpose it's pointeless to compare the value
> between 'data->predicted_us' and TICK_USEC, so I'd like to check if
> the next timer event is reliable to wake up CPU in short time, this
> can be finished by comparison between 'ktime_to_us(delta_next)' with
> maximum target residency;
>
> For the second target, we should not enable the tick again in the idle
> loop after the tick is stopped, whatever the governor choose any idle
> state.
>
> So how about below changes?  I did some verify on this.

I have a similar, but somewhat different patch.  I'll post it shortly.

>
> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
> index 30ab759..e2de7c2 100644
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -351,18 +351,21 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>         data->predicted_us = min(data->predicted_us, expected_interval);
>
>         if (tick_nohz_tick_stopped()) {
> +               unsigned int delta_next_us = ktime_to_us(delta_next);
> +
>                 /*
>                  * If the tick is already stopped, the cost of possible short
>                  * idle duration misprediction is much higher, because the CPU
>                  * may be stuck in a shallow idle state for a long time as a
> -                * result of it.  In that case say we might mispredict and try
> -                * to force the CPU into a state for which we would have stopped
> -                * the tick, unless a timer is going to expire really soon
> -                * anyway.
> +                * result of it.  If the next timer event is later than the
> +                * maximum target residency, this means the timer event is not
> +                * trusted to wake up CPU in short term and the typical pattern
> +                * interval or other factors might lead to a shallow state, in
> +                * that case say we might mispredict and use the known time to
> +                * the closest timer event for the idle state selection.
>                  */
> -               if (data->predicted_us < TICK_USEC)
> -                       data->predicted_us = min_t(unsigned int, TICK_USEC,
> -                                                  ktime_to_us(delta_next));
> +               if (delta_next_us >= drv->states[drv->state_count-1].target_residency)
> +                       data->predicted_us = delta_next_us;
>         } else {
>                 /*
>                  * Use the performance multiplier and the user-configurable
> @@ -410,12 +413,12 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>          * expected idle duration is shorter than the tick period length.
>          */
>         if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) ||
> -           expected_interval < TICK_USEC) {
> +           (!tick_nohz_tick_stopped() && expected_interval < TICK_USEC)) {
>                 unsigned int delta_next_us = ktime_to_us(delta_next);
>
>                 *stop_tick = false;
>
> -               if (!tick_nohz_tick_stopped() && idx > 0 &&
> +               if (idx > 0 &&
>                     drv->states[idx].target_residency > delta_next_us) {
>                         /*
>                          * The tick is not going to be stopped and the target

Powered by blists - more mailing lists