Message-ID: <20130917082649.GE20661@gmail.com>
Date: Tue, 17 Sep 2013 10:26:49 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: hpa@...or.com, linux-kernel@...r.kernel.org,
gerlando.falauto@...mile.com, john.stultz@...aro.org,
minggr@...il.com, tglx@...utronix.de,
linux-tip-commits@...r.kernel.org, lttng-dev@...ts.lttng.org
Subject: Re: [tip:timers/urgent] timekeeping: Fix HRTICK related deadlock from ntp lock changes

* Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> * Ingo Molnar (mingo@...nel.org) wrote:
> >
> > * Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> >
> > > Hi Ingo,
> > >
> > > Do you have an estimate of the time it will take for this fix to hit
> > > mainline, stable-3.10 and stable-3.11 ? Meanwhile, I'm marking 3.10 and
> > > 3.11 as broken for LTTng with a kernel version at compile-time, since
> > > this kernel regression currently triggers hard system lockup when people
> > > use LTTng on those kernels, and this is certainly something nobody
> > > wants.
> >
> > So, at least as per the description of John, this should only trigger if
> > SCHED_HRTICK is enabled in sched_features - which is disabled by default,
> > it's a debug-only development feature. Does the bug trigger on more
> > regular kernels as well?
>
> Unfortunately, it does happen on a pretty standard kernel config (giving
> my x230 config as example below). Pasting relevant bug description from
> http://bugs.lttng.org/issues/631 :
>
> "Starting from Linux kernel commit
> 06c017fdd4dc48451a29ac37fc1db4a3f86b7f40 "timekeeping: Hold
> timekeepering locks in do_adjtimex and hardpps" (3.10 kernels), the
> xtime write seqlock is held across calls to __do_adjtimex(), which
> includes a call to notify_cmos_timer(), and hence
> schedule_delayed_work().
>
> This introduces a side-effect for a set of tracepoints, including mainly
> the workqueue tracepoints: a tracer hooking on those tracepoints and
> reading current time with ktime_get() will cause hard system LOCKUP"

It's the LTTng tracepoint 'hooking' in something that does something
invalid in that context that is causing the hang, not the vanilla kernel
itself, right?

In that case the 'you get to keep both pieces' policy of out of tree code
applies - but the HRTICK fix should solve your problem as well,
incidentally.

Thanks,

	Ingo
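
As an illustration of the lockup described in the quoted bug report, here
is a minimal single-threaded userspace sketch in C. It is not kernel code
and not taken from the thread: every name in it (seq_model,
model_read_time, model_probe, model_adjtimex) is hypothetical, and the
reader gives up after a bounded number of spins so that the example
terminates, whereas a real seqlock reader would keep retrying forever.

```c
#include <stdbool.h>

/* Hypothetical model of a sequence counter protecting a time value. */
struct seq_model {
    unsigned sequence;  /* odd while a writer is inside the write section */
    long long time_ns;  /* data protected by the counter */
};

static void model_write_begin(struct seq_model *s) { s->sequence++; }
static void model_write_end(struct seq_model *s)   { s->sequence++; }

/*
 * Bounded stand-in for a seqlock reader such as ktime_get().
 * Returns true if a consistent value was read, false if the reader
 * would still be spinning after max_spins attempts.
 */
static bool model_read_time(const struct seq_model *s, int max_spins,
                            long long *out)
{
    for (int i = 0; i < max_spins; i++) {
        unsigned start = s->sequence;
        if (start & 1)
            continue;               /* writer in progress: retry */
        long long t = s->time_ns;
        if (s->sequence == start) { /* no writer interfered */
            *out = t;
            return true;
        }
    }
    return false;
}

/* Stand-in for a tracer probe attached to a workqueue tracepoint. */
static bool model_probe(const struct seq_model *s)
{
    long long now;
    return model_read_time(s, 1000, &now);
}

/*
 * Stand-in for the adjtimex path. With notify_inside_lock set, the
 * probe runs while the write section is still open, on the same
 * thread, so the writer can never finish and the reader can never
 * succeed. Returns whether the probe managed to read the time.
 */
static bool model_adjtimex(struct seq_model *s, bool notify_inside_lock)
{
    bool probe_ok = false;

    model_write_begin(s);
    s->time_ns += 1;
    if (notify_inside_lock)
        probe_ok = model_probe(s);  /* models work being queued under the lock */
    model_write_end(s);

    if (!notify_inside_lock)
        probe_ok = model_probe(s);  /* notification after the lock is dropped */
    return probe_ok;
}
```

Calling model_adjtimex(&s, true) on a zero-initialized seq_model returns
false, which corresponds to the hard lockup in the report; calling it
with false returns true. The second variant only models the general idea
of doing the notification outside the write section and is an assumption
about the shape of a fix, not a description of the actual patch.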