Message-ID: <6DB365C6-98F6-4C27-B0BE-0833E5D4962E@akamai.com>
Date: Fri, 16 Nov 2018 18:46:36 +0000
From: "Zhivich, Michael" <mzhivich@...mai.com>
To: John Stultz <john.stultz@...aro.org>
CC: lkml <linux-kernel@...r.kernel.org>,
"tiny.windzz@...il.com" <tiny.windzz@...il.com>,
Joel Fernandes <joel@...lfernandes.org>,
"alexander.levin@...izon.com" <alexander.levin@...izon.com>,
"frederic@...nel.org" <frederic@...nel.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Ingo Molnar <mingo@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Thomas Gleixner <tglx@...utronix.de>,
Arnd Bergmann <arnd@...db.de>,
"Ondrej Mosnacek" <omosnace@...hat.com>,
Jason Wessel <jason.wessel@...driver.com>,
"kreview@...mai.com" <kreview@...mai.com>
Subject: Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd
On 11/15/18, 12:17 PM, "John Stultz" <john.stultz@...aro.org> wrote:
On Thu, Nov 15, 2018 at 9:07 AM, Michael Zhivich <mzhivich@...mai.com> wrote:
> Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
> ksoftirqd runs, thus preventing problems with reading clocksources that
> wrap often (e.g. acpi_pm).
>
> If acpi_pm is used as the clocksource watchdog, and the machine is under heavy
> load, the time period for the watchdog check may be significantly longer
> than the requested 0.5 seconds. If the watchdog check is delayed by 2
> seconds (observed behavior), then acpi_pm time delta will be
>
> 2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c7f
>
> which will be treated as negative (since acpi_pm is only 24 bits wide) and
> truncated to 0. This behavior will cause tsc to be incorrectly declared
> unstable in clocksource_watchdog(), as it no longer agrees with acpi_pm.
> If the clocksource watchdog check is delayed by more than 4.7 sec, then the
> acpi_pm clocksource will wrap altogether and produce incorrect time delta.
>
> The likely cause of this delay is that timer interrupts are serviced in
> ksoftirqd when the machine is very busy.
>
> Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
> synchronous"):
> ...
> We should probably also consider the timer softirqs to be synchronous
> and not be delayed to ksoftirqd (since they were the issue with the
> earlier watchdog problems), but that should be done as a separate patch.
> ...
>
> Signed-off-by: Michael Zhivich <mzhivich@...mai.com>
> ---
> kernel/softirq.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d28813306b2c..6d517ce0fba8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
> * right now. Let ksoftirqd handle this at its own rate, to get fairness,
> * unless we're doing some of the synchronous softirqs.
> */
> -#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +#define SOFTIRQ_NOW_MASK \
> + ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << TIMER_SOFTIRQ))
> static bool ksoftirqd_running(unsigned long pending)
> {
> struct task_struct *tsk = __this_cpu_read(ksoftirqd);
Thanks so much for sending this along! Sorry I didn't get back to your
mail earlier this week, I've been at Plumbers.
So while this does try to attack the reliability issue w/ the
clocksource watchdog being delayed, I worry this will have too many
side-effects elsewhere.
Would a more focused fix be to move the clocksource watchdog from a
normal timer to a hrtimer?
thanks
-john
Hi John,
That's an interesting idea - it would get clocksource watchdog out of ksoftirqd. However, clocksource watchdog iterates over available CPUs to check the TSC on each core (see add_timer_on() call in clocksource_watchdog()). I'm not seeing an API to start an hrtimer on a specific CPU - is this possible and I'm missing it? Or would something like this have to be added to hrtimer?
Thanks,
~ Michael
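
The 24-bit arithmetic in the quoted commit message can be checked with a small user-space model. This is only a sketch, assuming the delta is computed as a masked subtraction and clamped to zero when the top bit of the counter width is set. The names pm_masked_delta, pm_observed_delta, PM_MASK and PM_HZ are made up for illustration and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define PM_MASK 0xFFFFFFull   /* acpi_pm counter is 24 bits wide */
#define PM_HZ   3579545ull    /* acpi_pm tick rate from the commit message */

/*
 * Hypothetical model: masked delta between two counter reads. If the top
 * bit of the counter width is set, the result is treated as negative and
 * clamped to 0.
 */
static uint64_t pm_masked_delta(uint64_t now, uint64_t last, uint64_t mask)
{
    assert(mask != 0);
    uint64_t delta = (now - last) & mask;
    return (delta & ~(mask >> 1)) ? 0 : delta;
}

/* Delta a watchdog would see after elapsed_ms, starting from counter 0. */
static uint64_t pm_observed_delta(uint64_t elapsed_ms)
{
    uint64_t ticks = PM_HZ * elapsed_ms / 1000;   /* ticks that really elapsed */
    uint64_t now = ticks & PM_MASK;               /* counter wraps at 2^24 */
    return pm_masked_delta(now, 0, PM_MASK);
}
```

Under these assumptions, pm_observed_delta(500) returns 1789772 (the expected half second), pm_observed_delta(2500) returns 0 (bit 23 is set, so the delta is clamped), and pm_observed_delta(5000) returns 1120509 (the counter has wrapped once, so 5 seconds looks like roughly 0.31 seconds).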