Message-ID: <alpine.DEB.2.02.1404301207270.6261@ionos.tec.linutronix.de>
Date: Wed, 30 Apr 2014 12:35:51 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Stuart Hayes <stuart.w.hayes@...il.com>
cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hrtimer: invalid timeout set after hang_detected
Stuart,
On Tue, 29 Apr 2014, Stuart Hayes wrote:
> Make hrtimer_force_reprogram() not reprogram the clock event device if hang_detected has been set in hrtimer_interrupt().
>
Please use proper line breaks for the changelog.
> This can occur, for instance, if a CPU goes idle and calls
> tick_nohz_stop_sched_tick() after hang_detected is set. The
> function tick_nohz_stop_sched_tick() will call hrtimer_start() to
> reprogram the sched_timer to a longer timeout. hrtimer_start() will
> call __hrtimer_start_range_ns(), which first calls remove_hrtimer()
> to remove sched_timer, then hrtimer_enqueue_reprogram() to add it
> with its new timeout. The problem is that remove_hrtimer() calls
> __remove_hrtimer(), which calls hrtimer_force_reprogram(), and
> hrtimer_force_reprogram() ignores hang_detected and will reprogram
> the clock event device to the next soonest hrtimer expiry, which
> could be, say, 11 seconds away. This overwrites the value that was
> programmed into the clock event device when hang_detected was set
> (which was no more than 100ms). Then hrtimer_enqueue_reprogram()
> calls hrtimer_reprogram(), which observes hang_detected and does not
> reprogram the clock event device, so the device remains set to the
> value of, in this example, 11 seconds, during which time no clock
> event device interrupts occur and no timer expiration functions are
> run.
I took the liberty to rewrite the changelog. Please have a look and
judge for yourself what's easier to grasp.
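
For anyone following the call chain in the quoted changelog, here is a
minimal user-space sketch of the sequence. It is not kernel code: the
struct, the model_* functions, the honour_hang switch and the 12 second
sched_timer expiry are all hypothetical and only loosely mirror
hrtimer_force_reprogram() and hrtimer_reprogram(). The 100ms and 11
second values are the ones from the changelog example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_MSEC 1000000LL
#define NSEC_PER_SEC  1000000000LL

/* Hypothetical stand-in for the per-CPU hrtimer state. */
struct cpu_base_model {
	bool hang_detected;	/* set when the interrupt handler detected a hang */
	int64_t device_ns;	/* delay currently programmed into the clock event device */
};

/* Models hrtimer_reprogram(): it already skips reprogramming after a hang. */
void model_reprogram(struct cpu_base_model *b, int64_t expires_ns)
{
	if (b->hang_detected)
		return;
	if (expires_ns < b->device_ns)
		b->device_ns = expires_ns;
}

/*
 * Models hrtimer_force_reprogram(). honour_hang == false is the behaviour
 * described in the changelog, honour_hang == true is the proposed check.
 */
void model_force_reprogram(struct cpu_base_model *b, int64_t next_expiry_ns,
			   bool honour_hang)
{
	if (honour_hang && b->hang_detected)
		return;
	b->device_ns = next_expiry_ns;
}

/* Replays the hrtimer_start() sequence and returns the programmed delay. */
int64_t simulate(bool honour_hang)
{
	struct cpu_base_model b = {
		.hang_detected = true,
		.device_ns = 100 * NSEC_PER_MSEC,	/* enforced by hang detection */
	};

	/* remove_hrtimer() path: the next soonest timer is 11 seconds away. */
	model_force_reprogram(&b, 11 * NSEC_PER_SEC, honour_hang);

	/* hrtimer_enqueue_reprogram() path: sched_timer is added again. */
	model_reprogram(&b, 12 * NSEC_PER_SEC);

	return b.device_ns;
}

/* Usage sketch: compare the two variants of the model. */
void compare_models(void)
{
	int64_t unpatched = simulate(false);
	int64_t patched = simulate(true);

	assert(patched < unpatched);
	printf("unpatched: %lld ns, patched: %lld ns\n",
	       (long long)unpatched, (long long)patched);
}
```

In this model the unpatched variant leaves the device programmed 11
seconds out, while the variant with the check keeps the 100ms value
that the hang detection enforced.
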
>
> Signed-off-by: Stuart Hayes <stuart.w.hayes@...il.com>
> ---
>
> --- linux-3.15-rc3/kernel_orig/hrtimer.c 2014-04-29 13:10:58.087832963 -0400
> +++ linux-3.15-rc3/kernel/hrtimer.c 2014-04-29 15:42:49.581084736 -0400
> @@ -569,6 +569,15 @@ hrtimer_force_reprogram(struct hrtimer_c
>
> cpu_base->expires_next.tv64 = expires_next.tv64;
>
> + /*
> + * If a hang was detected in the last timer interrupt then we
> + * do not schedule a timer which is earlier than the expiry
> + * which we enforced in the hang detection. We want the system
> + * to make progress.
So you just blindly copied the comment from hrtimer_reprogram(). But
this is not only about scheduling a timer which is earlier.
Fixed it up, but please be more careful when you submit patches next
time.
Thanks,
tglx