[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180907094148.2444acc6@gandalf.local.home>
Date: Fri, 7 Sep 2018 09:41:48 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Petr Mladek <pmladek@...e.com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri, 7 Sep 2018 09:34:48 +0200
Peter Zijlstra <peterz@...radead.org> wrote:
> On Wed, Sep 05, 2018 at 09:33:34PM -0400, Steven Rostedt wrote:
> > do_idle {
> >
> > [interrupts enabled]
> >
> > <interrupt> [interrupts disabled]
> > TRACE_IRQS_OFF [lockdep says irqs off]
> > [...]
> > TRACE_IRQS_IRET
> > test if pt_regs say return to interrupts enabled [yes]
> > TRACE_IRQS_ON [lockdep says irqs are on]
> >
> > <nmi>
> > nmi_enter() {
> > printk_nmi_enter() [traced by ftrace]
> > [ hit ftrace breakpoint ]
> > <breakpoint exception>
> > TRACE_IRQS_OFF [lockdep says irqs off]
> > [...]
> > TRACE_IRQS_IRET [return from breakpoint]
> > test if pt_regs say interrupts enabled [no]
> > [iret back to interrupt]
> > [iret back to code]
> >
> > tick_nohz_idle_enter() {
> >
> > lockdep_assert_irqs_enabled() [lockdep say no!]
>
> Isn't the problem that we muck with the IRQ state from NMI context? We
> shouldn't be doing that.
Not really. The problem is that the exception handling code jumps into
lockdep, and we allow exceptions to happen in NMIs. It's just that the
tracing of printk_nmi_enter() caused an exception before lockdep was
turned off.
>
> The thing is, since we trace the IRQ state from within IRQ-disable,
> since that's the only IRQ-safe option, it is very much not NMI-safe.
>
> Your patch might avoid the symptom, but I don't think it cures the
> fundamental problem.
No, the issue is that the nmi_enter() code was traced early which adds
breakpoints and breakpoints causes exceptions which jumps into lockdep,
and we did this before lockdep was turned off.
-- Steve
Powered by blists - more mailing lists