[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20081222070044.GC29160@elte.hu>
Date: Mon, 22 Dec 2008 08:00:44 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Frans Pop <elendil@...net.nl>
Cc: Frederic Weisbecker <fweisbec@...il.com>, tglx@...utronix.de,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hrtimer: increase clock min delta threshold while
interrupt hanging
* Frans Pop <elendil@...net.nl> wrote:
> > Impact: avoid hanging on slow systems
> >
> > While using the function graph tracer on a virtualized system, the
> > hrtimer_interrupt can hang the system on an infinite loop. This can be
> > caused on several situation where something intrusive is slowing the
> > system (ie: tracing) and the next clock events to program are always
> > before the current time. This patch implements a reasonable
> > compromise. If such a situation is detected, we share the CPUs time in
> > 1/4 to process the hrtimer interrupts. This is enough to let the
> > system running without serious starvation.
>
> Should there maybe also be a mechanism to allow the system to
> automatically "recover" to higher (the original?) clockfrequencies, for
> example if the danger of loops has passed after tracing has been
> disabled?
I dont think that's necessary - tick_dev_program_event() already includes
a gradual approach that increases the 'min delta' interval step by step -
so we should find the 'system is limping along' compromise pretty
accurately.
A system can get "magically faster" later on (if we turn off tracing or
other kernel features that impact the cost of the timer tick), and we
might not need those safety measures anymore - but here the real solution
will be to make the virtualizer faster. Taking milliseconds to process a
timer tick (be that under tracing or not) is rather slow. So the kernel
has applied the brakes and has printed a warning about it - we should do
no more.
> > +static int force_clock_reprogram;
>
> Shouldn't this be initialized to 0?
no - global or static scope variables are initialized to zero in C.
> > @@ -1239,7 +1267,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
> > /* Reprogramming necessary ? */
> > if (expires_next.tv64 != KTIME_MAX) {
> > - if (tick_program_event(expires_next, 0))
> > + if (tick_program_event(expires_next, force_clock_reprogram))
> > goto retry;
> > }
> > }
>
> Shouldn't force_clock_reprogram be reset to 0 after it has fired and
> been handled?
hm, that would be interesting to see - in theory the system should become
stable after a few iterations of increasing min_delta. Frederic, is the
system still workable if you try what Frans has suggested?
Also, there's min_delta doubling in tick_dev_program_event() itself too -
that interacts with the irq-overload logic:
+ dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists