Message-ID: <20210710005243.GA23956@lothringen>
Date: Sat, 10 Jul 2021 02:52:43 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Nicolas Saenz Julienne <nsaenzju@...hat.com>
Cc: He Zhe <zhe.he@...driver.com>, anna-maria@...utronix.de,
linux-kernel@...r.kernel.org, tglx@...utronix.de
Subject: Re: [PATCH] timers: Fix get_next_timer_interrupt() with no timers pending

On Fri, Jul 09, 2021 at 04:13:25PM +0200, Nicolas Saenz Julienne wrote:
> 31cd0e119d50 ("timers: Recalculate next timer interrupt only when
> necessary") subtly altered get_next_timer_interrupt()'s behaviour. The
> function no longer consistently returns KTIME_MAX with no timers
> pending.
>
> In order to decide if there are any timers pending we check whether the
> next expiry will happen NEXT_TIMER_MAX_DELTA jiffies from now.
> Unfortunately, the next expiry time and the timer base clock are no
> longer updated in unison. The former changes upon certain timer
> operations (enqueue, expire, detach), whereas the latter keeps track of
> jiffies as they move forward, which ultimately breaks the logic above.
>
> A simplified example:
>
> - Upon entering get_next_timer_interrupt() with:
>
> jiffies = 1
> base->clk = 0;
> base->next_expiry = NEXT_TIMER_MAX_DELTA;
>
> 'base->next_expiry == base->clk + NEXT_TIMER_MAX_DELTA', the function
> returns KTIME_MAX.
>
> - 'base->clk' is updated to the jiffies value.
>
> - The next time we enter get_next_timer_interrupt(), taking into account
> no timer operations happened:
>
> base->clk = 1;
> base->next_expiry = NEXT_TIMER_MAX_DELTA;
>
> 'base->next_expiry != base->clk + NEXT_TIMER_MAX_DELTA', the function
> returns a valid expire time, which is incorrect.
>
> This ultimately might unnecessarily rearm sched's timer on nohz_full
> setups, and add latency to the system[1].
>
> So, introduce 'base->timers_pending'[2], update it every time
> 'base->next_expiry' changes, and use it in get_next_timer_interrupt().
>
> [1] See tick_nohz_stop_tick().
> [2] A quick pahole check on x86_64 and arm64 shows it doesn't make
> 'struct timer_base' any bigger.
>
> Fixes: 31cd0e119d50 ("timers: Recalculate next timer interrupt only when necessary")
> Signed-off-by: Nicolas Saenz Julienne <nsaenzju@...hat.com>
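
The simplified example in the changelog above can be replayed with a small
user-space model. This is only a sketch under stated assumptions, not kernel
code: struct toy_base, toy_forward() and the two check helpers are made-up
names, and TOY_MAX_DELTA is a stand-in for NEXT_TIMER_MAX_DELTA. It shows the
old comparison flipping once the base clock moves, while a flag stays
correct.

```c
#include <stdbool.h>

/* Stand-in for NEXT_TIMER_MAX_DELTA; the exact value does not matter here. */
#define TOY_MAX_DELTA ((1UL << 30) - 1)

/* Hypothetical, reduced model of a timer base: only the fields the
 * changelog example talks about. */
struct toy_base {
	unsigned long clk;          /* follows jiffies */
	unsigned long next_expiry;  /* only changes on timer operations */
	bool timers_pending;        /* the flag the patch introduces */
};

/* Old logic: infer "no timers" from next_expiry relative to clk. */
static inline bool toy_idle_old(const struct toy_base *base)
{
	return base->next_expiry == base->clk + TOY_MAX_DELTA;
}

/* New logic: rely on the explicit flag instead. */
static inline bool toy_idle_new(const struct toy_base *base)
{
	return !base->timers_pending;
}

/* Models the base clock catching up with jiffies while no timer
 * operation touches next_expiry. */
static inline void toy_forward(struct toy_base *base, unsigned long jiffies)
{
	base->clk = jiffies;
}
```

With clk = 0 and next_expiry = TOY_MAX_DELTA both helpers report an idle
base. After toy_forward(&base, 1), toy_idle_old() no longer does, although
no timer was queued in between.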

Very good catch.

And the fix looks good:

Acked-by: Frederic Weisbecker <frederic@...nel.org>

I guess later we can turn this .timers_pending into
.timers_count and that would spare us the costly call to
__next_timer_interrupt() up to the last level after the last
timer is dequeued.

Anyway, thanks a lot!
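
To make the .timers_count idea a bit more concrete, here is a rough
user-space sketch. Everything in it is an assumption for illustration:
struct cnt_base, the cnt_*() helpers and the scans counter do not exist in
the kernel, the wheel scan is only simulated by bumping a counter, and
expiries are plain jiffies values. The point is just that once the count
drops to zero, the next-event lookup can answer right away without
rescanning.

```c
#include <stdbool.h>
#include <stdint.h>

#define CNT_MAX_DELTA ((1UL << 30) - 1)  /* stand-in for NEXT_TIMER_MAX_DELTA */
#define CNT_TIME_MAX  INT64_MAX          /* stand-in for KTIME_MAX */

/* Hypothetical model of a timer base that counts queued timers. */
struct cnt_base {
	unsigned long clk;
	unsigned long next_expiry;
	unsigned int timers_count;  /* would replace timers_pending */
	bool recalc;                /* next_expiry may be stale */
	unsigned int scans;         /* number of simulated wheel scans */
};

static inline void cnt_enqueue(struct cnt_base *base, unsigned long expires)
{
	/* First timer, or earlier than the current next expiry. */
	if (base->timers_count == 0 || expires < base->next_expiry)
		base->next_expiry = expires;
	base->timers_count++;
}

static inline void cnt_detach(struct cnt_base *base)
{
	if (base->timers_count > 0)
		base->timers_count--;
	base->recalc = true;  /* the removed timer may have been the next one */
}

/* Returns CNT_TIME_MAX when nothing is queued, else the next expiry in
 * jiffies (no conversion to nanoseconds in this toy model). */
static inline int64_t cnt_next_event(struct cnt_base *base)
{
	if (base->timers_count == 0) {
		/* Nothing queued: skip the scan entirely. */
		base->next_expiry = base->clk + CNT_MAX_DELTA;
		base->recalc = false;
		return CNT_TIME_MAX;
	}
	if (base->recalc) {
		base->scans++;  /* a real base would rescan the wheel here */
		base->recalc = false;
	}
	return (int64_t)base->next_expiry;
}
```

Queueing one timer, detaching it and then asking for the next event returns
CNT_TIME_MAX with scans still at zero, which is the saving described above.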