linux-kernel - Re: [PATCH] hrtimer: increase clock min delta threshold while interrupt hanging

Message-ID: <20081222070044.GC29160@elte.hu>
Date:	Mon, 22 Dec 2008 08:00:44 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Frans Pop <elendil@...net.nl>
Cc:	Frederic Weisbecker <fweisbec@...il.com>, tglx@...utronix.de,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hrtimer: increase clock min delta threshold while
	interrupt hanging

* Frans Pop <elendil@...net.nl> wrote:

> > Impact: avoid hanging on slow systems
> >
> > While using the function graph tracer on a virtualized system,
> > hrtimer_interrupt can hang the system in an infinite loop. This can
> > happen in several situations where something intrusive is slowing the
> > system (e.g. tracing) and the next clock events to program are always
> > before the current time. This patch implements a reasonable
> > compromise. If such a situation is detected, we give 1/4 of the CPU's
> > time to processing the hrtimer interrupts. This is enough to keep the
> > system running without serious starvation.
> 
> Should there maybe also be a mechanism to allow the system to 
> automatically "recover" to higher (the original?) clock frequencies, for 
> example if the danger of loops has passed after tracing has been 
> disabled?

I don't think that's necessary - tick_dev_program_event() already includes 
a gradual approach that increases the 'min delta' interval step by step - 
so we should find the 'system is limping along' compromise pretty 
accurately.
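
To illustrate the step-by-step idea, here is a minimal userspace model, not the kernel code. The struct, the function name, the fixed "programming cost" and the doubling factor are all assumptions made for the sketch; the real logic lives in tick_dev_program_event() and differs in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a clock event device; NOT the kernel struct. */
struct model_dev {
	uint64_t min_delta_ns;    /* shortest delay we are willing to program */
	uint64_t program_cost_ns; /* assumed time lost while programming the event */
};

/*
 * Model of the gradual approach: an event programmed min_delta_ns ahead
 * is "already in the past" if programming takes at least that long.
 * On every such failure min_delta_ns is doubled and we try again.
 *
 * Returns the number of increases needed, or -1 if max_retries was
 * not enough to reach a workable interval.
 */
int model_program_min_delta(struct model_dev *dev, int max_retries)
{
	int retries = 0;

	assert(dev->min_delta_ns > 0); /* doubling zero would never progress */

	while (dev->min_delta_ns <= dev->program_cost_ns) {
		if (retries == max_retries)
			return -1;
		dev->min_delta_ns *= 2;
		retries++;
	}
	return retries;
}
```

In this model a device that starts with a 1000 ns minimum and loses 5000 ns per programming attempt settles at 8000 ns after three increases, which is the "limping along" compromise in miniature.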

A system can get "magically faster" later on (if we turn off tracing or 
other kernel features that impact the cost of the timer tick), and we 
might not need those safety measures anymore - but here the real solution 
will be to make the virtualizer faster. Taking milliseconds to process a 
timer tick (be that under tracing or not) is rather slow. So the kernel 
has applied the brakes and has printed a warning about it - we should do 
no more.

> > +static int force_clock_reprogram;
> 
> Shouldn't this be initialized to 0?

no - global or static scope variables are initialized to zero in C.
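
A small standalone sketch of that rule, outside the kernel: objects with static storage duration that have no explicit initializer start out as zero. Only the variable name force_clock_reprogram comes from the patch; the second variable and the helper functions are made up for the illustration.

```c
/* No initializer: static storage duration means it starts as 0. */
static int force_clock_reprogram;

/* Explicit initializer, same starting value as the line above. */
static int explicitly_zeroed = 0;

/* Returns the current value of the flag (0 until something sets it). */
int current_flag(void)
{
	return force_clock_reprogram;
}

/* Returns 1 if both variables hold the same value. */
int flags_match(void)
{
	return force_clock_reprogram == explicitly_zeroed;
}
```

Kernel style generally prefers leaving out the "= 0" for such variables.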

> > @@ -1239,7 +1267,7 @@ void hrtimer_interrupt(struct clock_event_device *dev) 
> >         /* Reprogramming necessary ? */
> >         if (expires_next.tv64 != KTIME_MAX) {
> > -               if (tick_program_event(expires_next, 0))
> > +               if (tick_program_event(expires_next, force_clock_reprogram))
> > 			goto retry; 
> >         }
> >  }
> 
> Shouldn't force_clock_reprogram be reset to 0 after it has fired and 
> been handled?

hm, that would be interesting to see - in theory the system should become 
stable after a few iterations of increasing min_delta. Frederic, is the 
system still workable if you try what Frans has suggested?

Also, there's min_delta doubling in tick_dev_program_event() itself too - 
that interacts with the irq-overload logic:

+       dev->min_delta_ns = (unsigned long)try_time.tv64 * 3;
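
One way to read that factor of 3, as a sketch: if handling the timer interrupts takes try_time and the next event is never programmed sooner than 3 * try_time later, the handler gets about 1/4 of the CPU, which matches the "1/4" in the patch description. The function names below are hypothetical, and the model assumes the next event fires exactly min_delta after the handler finishes.

```c
#include <stdint.h>

/* Mirrors the arithmetic of the quoted patch line: min delta = 3 * try_time. */
uint64_t overload_min_delta_ns(uint64_t try_time_ns)
{
	return try_time_ns * 3;
}

/*
 * Percentage of CPU time spent in the handler, assuming each cycle is
 * try_time_ns of handling followed by min_delta_ns of normal execution.
 */
uint64_t handler_share_percent(uint64_t try_time_ns)
{
	uint64_t period_ns = try_time_ns + overload_min_delta_ns(try_time_ns);

	if (period_ns == 0)
		return 0; /* nothing measured, nothing to share */
	return try_time_ns * 100 / period_ns;
}
```

With a 2 ms handling time this gives a 6 ms minimum delta and a 25% handler share.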

	Ingo