lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.1.00.0803221530290.3781@apollo.tec.linutronix.de>
Date:	Sat, 22 Mar 2008 15:41:01 +0100 (CET)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Andi Kleen <andi@...stfloor.org>
cc:	Gabriel C <crazy@...galware.org>,
	Gabriel C <nix.or.die@...glemail.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	LKML <linux-kernel@...r.kernel.org>,
	Adrian Bunk <bunk@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Natalie Protasevich <protasnb@...il.com>,
	andi-bz@...stfloor.org, Ingo Molnar <mingo@...e.hu>
Subject: Re: 2.6.25-rc5-git6: Reported regressions from 2.6.24

On Sat, 22 Mar 2008, Andi Kleen wrote:
> > CPU0 runs the watchdog timer and schedules it on CPU1.
> > 
> > With NO_HZ enabled CPU1 is in a long idle sleep. At this point of the
> > boot process there is probably no timer pending on CPU1, which means
> > the idle sleep is infinite.
> > 
> > Now some time later CPU1 gets woken by an interrupt/IPI and runs the
> > timer wheel. At this point the pm_timer which is the reference clock
> > has already wrapped around, so the watchdog thinks that there is a
> 
> In my old original own noidletick code I simply limited all sleeps 
> to below the wrap around of the primary timer.  Wouldn't something 
> like that work?

No, it does not solve the real problem of not reevaluating the timer
wheel on the idle CPU when a timer gets added from some other CPU. We
would paper over the watchdog issue, but postponing a timer event,
which was added cross CPU to some artifical expiry time is simply
wrong.

> I'm not sure just doing this for add_timer_on() only is correct.
> After all it could affect any other code not run by add_timer_on()
> couldn't it?

No, it's limited to add_timer_on() simply because no other code can
add a new timer (timer_list or hrtimer) which modifies the next event
on another CPU. There is also the rare case, when one CPU runs the
timer callback and the other one modifies the timer, but that's not
relevant for the NOHZ problem because the CPU which runs the callback
is not idle at this point.

All other timer operations are CPU local and reevaluated before the
CPU goes idle again.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ