[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c62985530812220728v49a6da6hc2057032bc4bcb19@mail.gmail.com>
Date: Mon, 22 Dec 2008 16:28:26 +0100
From: "Frédéric Weisbecker" <fweisbec@...il.com>
To: "Cyrill Gorcunov" <gorcunov@...il.com>
Cc: "Ingo Molnar" <mingo@...e.hu>,
"Thomas Gleixner" <tglx@...utronix.de>,
"Linux Kernel" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging
2008/12/22 Cyrill Gorcunov <gorcunov@...il.com>:
> [Frederic Weisbecker - Mon, Dec 22, 2008 at 02:24:48AM +0100]
> | Impact: avoid hanging on slow systems
> |
> | While using the function graph tracer on a virtualized system, the hrtimer_interrupt
> | can hang the system on an infinite loop.
> | This can be caused on several situation where something intrusive is slowing the
> | system (ie: tracing) and the next clock events to program are always before the current
> | time.
> | This patch implements a reasonable compromise. If such a situation is detected, we share
> | the CPUs time in 1/4 to process the hrtimer interrupts. This is enough to let the system
> | running without serious starvation.
> |
> | It has been successfully tested under VirtualBox with 1000 HZ and 100 HZ with function graph
> | tracer launched. On both cases, the clock events were increased until about 25 ms periodic ticks,
> | which means 40 HZ.
> |
> | Signed-off-by: Frederic Weisbecker <fweisbec@...il.com>
> | Cc: Thomas Gleixner <tglx@...utronix.de>
> | ---
> | kernel/hrtimer.c | 30 +++++++++++++++++++++++++++++-
> | 1 files changed, 29 insertions(+), 1 deletions(-)
> |
> | diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> | index bda9cb9..02f2477 100644
> | --- a/kernel/hrtimer.c
> | +++ b/kernel/hrtimer.c
> | @@ -1171,6 +1171,29 @@ static void __run_hrtimer(struct hrtimer *timer)
> |
> | #ifdef CONFIG_HIGH_RES_TIMERS
> |
> | +static int force_clock_reprogram;
> | +
> | +/*
> | + * After 5 iteration's attempts, we consider that hrtimer_interrupt()
> | + * is hanging, which could happen with something that slows the interrupt
> | + * such as the tracing. Then we force the clock reprogramming for each future
> | + * hrtimer interrupts to avoid infinite loops and use the min_delta_ns
> | + * threshold that we will overwrite.
> | + * The next tick event will be scheduled to 3 times we currently spend on
> | + * hrtimer_interrupt(). This gives a good compromise, the cpus will spend
> | + * 1/4 of their time to process the hrtimer interrupts. This is enough to
> | + * let it running without serious starvation.
> | + */
> | +
> | +static inline void
> | +hrtimer_interrupt_hanging(struct clock_event_device *dev,
> | + ktime_t try_time)
> | +{
> | + force_clock_reprogram = 1;
> | + dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
> | + printk(KERN_WARNING "hrtimer: interrupt too slow, "
> | + "forcing clock min delta to %lu ns\n", dev->min_delta_ns);
> | +}
> | /*
> | * High resolution timer interrupt
> | * Called with interrupts disabled
> | @@ -1180,6 +1203,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
> | struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
> | struct hrtimer_clock_base *base;
> | ktime_t expires_next, now;
> | + int nr_retries = 0;
> | int i;
> |
> | BUG_ON(!cpu_base->hres_active);
> | @@ -1187,6 +1211,10 @@ void hrtimer_interrupt(struct clock_event_device *dev)
> | dev->next_event.tv64 = KTIME_MAX;
> |
> | retry:
> | + /* 5 retries is enough to notice a hang */
> | + if (!(++nr_retries % 5))
> | + hrtimer_interrupt_hanging(dev, ktime_sub(ktime_get(), now));
> | +
> | now = ktime_get();
>
> Hi Frederic,
>
> is it really needed to use mod operation here?
This is a kind of paranoid check.
But actually you are right, it is not necessary.
If we force the clock reprogramming, we will not retry again...
> Why cant we test for plain 5 and flush it to zero then?
> I mean something like
>
> if (++nr_retries > 5) {
> nr_retries = 0;
> ...
> }
>
> Did I miss anything?
>
> - Cyrill -
>
Since the clock reprogramming can't fail anymore after that, we can
just check ++nr_retries == 5 or why not
++nr_retries >= 5 if we want to stay paranoid......
I will fix it if the patched is accepted...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists