Message-ID: <20130917163303.GA10491@Krystal>
Date:	Tue, 17 Sep 2013 12:33:03 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	hpa@...or.com, linux-kernel@...r.kernel.org,
	gerlando.falauto@...mile.com, john.stultz@...aro.org,
	minggr@...il.com, tglx@...utronix.de,
	linux-tip-commits@...r.kernel.org, lttng-dev@...ts.lttng.org
Subject: Re: [tip:timers/urgent] timekeeping: Fix HRTICK related deadlock
	from ntp lock changes

* Ingo Molnar (mingo@...nel.org) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> 
> > * Ingo Molnar (mingo@...nel.org) wrote:
> > > 
> > > * Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> > > 
> > > > Hi Ingo,
> > > > 
> > > > Do you have an estimate of the time it will take for this fix to hit 
> > > > mainline, stable-3.10 and stable-3.11 ? Meanwhile, I'm marking 3.10 and 
> > > > 3.11 as broken for LTTng with a kernel version at compile-time, since 
> > > > this kernel regression currently triggers hard system lockup when people 
> > > > use LTTng on those kernels, and this is certainly something nobody 
> > > > wants.
> > > 
> > > So, at least as per the description of John, this should only trigger if 
> > > SCHED_HRTICK is enabled in sched_features - which is disabled by default, 
> > > it's a debug-only development feature. Does the bug trigger on more 
> > > regular kernels as well?
> > 
> > Unfortunately, it does happen on a pretty standard kernel config (giving
> > my x230 config as example below). Pasting relevant bug description from
> > http://bugs.lttng.org/issues/631 :
> > 
> > "Starting from Linux kernel commit
> > 06c017fdd4dc48451a29ac37fc1db4a3f86b7f40 "timekeeping: Hold
> > timekeepering locks in do_adjtimex and hardpps" (3.10 kernels), the
> > xtime write seqlock is held across calls to __do_adjtimex(), which
> > includes a call to notify_cmos_timer(), and hence
> > schedule_delayed_work().
> > 
> > This introduces a side-effect for a set of tracepoints, including mainly 
> > the workqueue tracepoints: a tracer hooking on those tracepoints and 
> > reading current time with ktime_get() will cause hard system LOCKUP"
> 
> It's the LTTng tracepoint 'hooking' in something that does something 
> invalid in that context that is causing the hang, not the vanilla kernel 
> itself, right?

Yes, that's correct. To make sure this class of problem is taken care of
entirely, I've started working on a synchronization scheme proposed by
Peter Zijlstra that would allow ktime_get() to be called from any
execution context (see:
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg504089.html).
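
For readers unfamiliar with the failure mode, here is a minimal userspace
sketch of it. It is not kernel code: every toy_* name is hypothetical, the
sequence counter only imitates the general seqlock pattern, and the reader
is bounded so the demonstration terminates, whereas an unbounded reader in
the same position would spin forever. The point is only that a probe which
reads the clock from inside the write-side critical section can never
observe an even (stable) sequence value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a clock protected by a sequence counter. */
struct toy_seqcount {
    unsigned int sequence;  /* odd while a write is in progress */
    long long now_ns;       /* value guarded by the counter */
};

static void toy_write_begin(struct toy_seqcount *s)
{
    s->sequence++;          /* becomes odd: readers must retry */
}

static void toy_write_end(struct toy_seqcount *s)
{
    assert(s->sequence & 1u);   /* must be inside a write section */
    s->sequence++;              /* becomes even: readers may proceed */
}

/*
 * Bounded reader. Returns true if a consistent snapshot was obtained
 * within max_tries attempts. The bound exists only so that this sketch
 * terminates; an unbounded retry loop would hang in the failing case.
 */
static bool toy_read_time(const struct toy_seqcount *s, int max_tries,
                          long long *out)
{
    for (int i = 0; i < max_tries; i++) {
        unsigned int start = s->sequence;

        if (start & 1u)
            continue;           /* writer active, retry */

        long long value = s->now_ns;

        if (s->sequence == start) {
            *out = value;
            return true;
        }
    }
    return false;
}

/* Hypothetical tracer probe that timestamps an event. */
static bool toy_probe_ok;

static void toy_tracepoint_probe(const struct toy_seqcount *s)
{
    long long timestamp;

    toy_probe_ok = toy_read_time(s, 1000, &timestamp);
}

/*
 * Updates the clock. If probe_inside is true, the probe fires while the
 * write section is still open (the problematic ordering); otherwise it
 * fires after the write section is closed. Returns whether the probe
 * managed to read the clock.
 */
static bool toy_adjust_time(struct toy_seqcount *s, bool probe_inside)
{
    toy_write_begin(s);
    s->now_ns += 1000;
    if (probe_inside)
        toy_tracepoint_probe(s);
    toy_write_end(s);
    if (!probe_inside)
        toy_tracepoint_probe(s);
    return toy_probe_ok;
}
```

Calling toy_adjust_time() with probe_inside set to false lets the probe
read the clock; with probe_inside set to true the read never succeeds,
because the same execution context that is waiting for the write section
to end is the one that has to end it.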

> 
> In that case the 'you get to keep both pieces' policy of out of tree code 
> applies - but the HRTICK fix should solve your problem as well, 
> incidentally.

Thanks,

Mathieu

> 
> Thanks,
> 
> 	Ingo

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
