Date:	Tue, 17 Sep 2013 10:26:49 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:	hpa@...or.com, linux-kernel@...r.kernel.org,
	gerlando.falauto@...mile.com, john.stultz@...aro.org,
	minggr@...il.com, tglx@...utronix.de,
	linux-tip-commits@...r.kernel.org, lttng-dev@...ts.lttng.org
Subject: Re: [tip:timers/urgent] timekeeping: Fix HRTICK related deadlock
 from ntp lock changes


* Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:

> * Ingo Molnar (mingo@...nel.org) wrote:
> > 
> > * Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> > 
> > > Hi Ingo,
> > > 
> > > Do you have an estimate of the time it will take for this fix to hit 
> > > mainline, stable-3.10 and stable-3.11 ? Meanwhile, I'm marking 3.10 and 
> > > 3.11 as broken for LTTng with a kernel version at compile-time, since 
> > > this kernel regression currently triggers hard system lockup when people 
> > > use LTTng on those kernels, and this is certainly something nobody 
> > > wants.
> > 
> > So, at least as per the description of John, this should only trigger if 
> > SCHED_HRTICK is enabled in sched_features - which is disabled by default, 
> > it's a debug-only development feature. Does the bug trigger on more 
> > regular kernels as well?
> 
> Unfortunately, it does happen on a pretty standard kernel config (giving
> my x230 config as example below). Pasting relevant bug description from
> http://bugs.lttng.org/issues/631 :
> 
> "Starting from Linux kernel commit
> 06c017fdd4dc48451a29ac37fc1db4a3f86b7f40 "timekeeping: Hold
> timekeepering locks in do_adjtimex and hardpps" (3.10 kernels), the
> xtime write seqlock is held across calls to __do_adjtimex(), which
> includes a call to notify_cmos_timer(), and hence
> schedule_delayed_work().
> 
> This introduces a side-effect for a set of tracepoints, including mainly 
> the workqueue tracepoints: a tracer hooking on those tracepoints and 
> reading current time with ktime_get() will cause hard system LOCKUP"

So it is the LTTng probe hooking into those tracepoints, and doing 
something that is invalid in that context, that causes the hang, not the 
vanilla kernel itself, right?

In that case the 'you get to keep both pieces' policy for out-of-tree code 
applies - but the HRTICK fix should, incidentally, solve your problem as 
well.
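
For readers following the thread, the lockup described in the quoted bug
report can be modelled in userspace. The sketch below is illustrative only:
struct seq_model and every model_* function are hypothetical names, not the
kernel's seqlock or ktime_get() code, and the reader gives up after a bounded
number of spins so the demonstration terminates, whereas the real read side
would keep retrying.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a sequence counter; NOT the kernel API. */
struct seq_model {
    unsigned int sequence;  /* odd while a write section is open */
    long long time_ns;      /* data protected by the counter */
};

static void model_write_begin(struct seq_model *s) { s->sequence++; }
static void model_write_end(struct seq_model *s)   { s->sequence++; }

/*
 * Read side: retry while a write section is open or the counter changed.
 * Bounded by max_spins here purely so the model can report failure
 * instead of hanging. Returns true if a consistent value was read.
 */
static bool model_read_time(const struct seq_model *s,
                            unsigned int max_spins, long long *out)
{
    for (unsigned int i = 0; i < max_spins; i++) {
        unsigned int start = s->sequence;

        if (start & 1u)
            continue;               /* writer active: retry */

        long long t = s->time_ns;

        if (s->sequence == start) { /* no write happened meanwhile */
            *out = t;
            return true;
        }
    }
    return false;
}

/* Stand-in for a tracer probe that timestamps an event. */
static bool model_probe(const struct seq_model *s)
{
    long long now;

    return model_read_time(s, 1000, &now);
}

/*
 * Problem pattern: the probe fires inside the write section on the same
 * CPU. Nothing can close the write section while the probe spins, so the
 * read can never succeed.
 */
static bool model_update_probe_inside(struct seq_model *s)
{
    bool ok;

    model_write_begin(s);
    s->time_ns += 1000;
    ok = model_probe(s);
    model_write_end(s);
    return ok;
}

/* Safe pattern: the probe fires only after the write section is closed. */
static bool model_update_probe_after(struct seq_model *s)
{
    model_write_begin(s);
    s->time_ns += 1000;
    model_write_end(s);
    return model_probe(s);
}
```

Calling model_update_probe_inside() on a zero-initialised struct seq_model
returns false (the read never completes), while model_update_probe_after()
returns true. The second variant only illustrates the general idea of keeping
such calls outside the locked region; it is an assumption for this sketch, not
a description of the actual HRTICK patch.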

Thanks,

	Ingo