Message-ID: <20131017135941.GC28963@localhost.localdomain>
Date: Thu, 17 Oct 2013 15:59:49 +0200
From: Frederic Weisbecker <fweisbec@...il.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
"x86@...nel.org" <x86@...nel.org>,
"Wang, Xiaoming" <xiaoming.wang@...el.com>,
"Li, Zhuangzhi" <zhuangzhi.li@...el.com>,
"Liu, Chuansheng" <chuansheng.liu@...el.com>
Subject: Re: [PATCH] x86: Remove WARN_ON(in_nmi()) from vmalloc_fault
On Wed, Oct 16, 2013 at 12:36:32PM -0700, Paul E. McKenney wrote:
> On Wed, Oct 16, 2013 at 03:08:57PM +0200, Frederic Weisbecker wrote:
> > On Wed, Oct 16, 2013 at 08:45:18AM -0400, Steven Rostedt wrote:
> > > On Wed, 16 Oct 2013 13:40:37 +0200
> > > Frederic Weisbecker <fweisbec@...il.com> wrote:
> > >
> > > > On Tue, Oct 15, 2013 at 04:39:06PM -0400, Steven Rostedt wrote:
> > > > > Since the NMI iretq nesting has been fixed, there's no reason that
> > > > > an NMI handler can not take a page fault for vmalloc'd code. No locks
> > > > > are taken in that code path, and the software now handles nested NMIs
> > > > > when the fault re-enables NMIs on iretq.
> > > > >
> > > > > Not only that, if the vmalloc_fault() WARN_ON_ONCE() is hit, and that
> > > > > warn on triggers a vmalloc fault for some reason, then we can go into
> > > > > an infinite loop (the WARN_ON_ONCE() does the WARN() before updating
> > > > > the variable to make it happen "once").
> > > > >
> > > > > Reported-by: "Liu, Chuansheng" <chuansheng.liu@...el.com>
> > > > > Signed-off-by: Steven Rostedt <rostedt@...dmis.org>
> > > >
> > > > Thanks! For now we probably indeed want this patch. But I hope it's only
> > > > for the short term.
> > >
> > > Why?
> > >
> > > >
> > > > I still think that allowing faults in NMIs is very nasty, as we expect NMIs to never
> > > > be disturbed.
> > >
> > > We do faults (well, breakpoints really) in NMI to enable tracing.
> > >
> > > > I'm not even sure if that interacts correctly with the rcu_nmi_enter()
> > > > and preempt_count & NMI_MASK things. Not sure how perf is ready for that either (now
> > > > hardware events can be interrupted by fault trace events).
> > >
> > > I'm a bit confused. What doesn't interact correctly with
> > > rcu_nmi_enter()?
> >
> > Faults can call rcu_user_exit() / rcu_user_enter(). This is not supposed to happen
> > between rcu_nmi_enter() and rcu_nmi_exit(). rdtp->dynticks would be incremented in the
> > wrong way.
>
> I can attest to this! NMIs check for being nested within
> process/irq-based non-idle sojourns, but not the other way around.
> The result is that RCU will be ignoring you during that time, and not
> even disabling interrupts will save you. It will check rdtp->dynticks,
> see that its value is even, and register a quiescent state on behalf of
> the hapless CPU.
Fortunately, we avoid this thanks to the in_interrupt() check in user_enter()
and user_exit(). That check exists precisely to deal with traps/faults that
happen in interrupt context :)
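
To make the parity argument concrete, here is a rough userspace sketch of the
case where an NMI interrupts userspace and its handler then takes a fault. It
is a toy model, not kernel code: struct cpu_model, the mask values, the
"guarded" flag and watching_after_fault_in_nmi() are all made up for
illustration, and rcu_user_exit() is reduced to a bare increment of dynticks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit layout for the model; not the kernel's real masks. */
#define HARDIRQ_MASK 0x1u
#define NMI_MASK     0x2u

struct cpu_model {
	unsigned int preempt_count; /* context bits (irq, NMI) */
	unsigned int dynticks;      /* even: RCU ignores the CPU, odd: RCU watches it */
	bool in_user;               /* context tracking believes we are in userspace */
};

static bool in_interrupt(const struct cpu_model *c)
{
	return (c->preempt_count & (HARDIRQ_MASK | NMI_MASK)) != 0;
}

static bool rcu_watching(const struct cpu_model *c)
{
	return (c->dynticks & 1) != 0;
}

/* Models nmi_enter() + rcu_nmi_enter(): make dynticks odd if it was even. */
static void nmi_enter(struct cpu_model *c)
{
	c->preempt_count |= NMI_MASK;
	if (!rcu_watching(c))
		c->dynticks++;
}

/*
 * Models user_exit() as called on exception entry. With guarded == false
 * the in_interrupt() check is skipped, which is the broken case.
 */
static void user_exit(struct cpu_model *c, bool guarded)
{
	if (guarded && in_interrupt(c))
		return;
	if (c->in_user) {
		c->dynticks++; /* stands in for rcu_user_exit() */
		c->in_user = false;
	}
}

/*
 * Returns true if RCU is still watching the CPU after a fault taken inside
 * an NMI handler that interrupted userspace.
 */
bool watching_after_fault_in_nmi(bool guarded)
{
	struct cpu_model c = { .preempt_count = 0, .dynticks = 0, .in_user = true };

	nmi_enter(&c);
	assert(rcu_watching(&c)); /* the NMI made dynticks odd */
	user_exit(&c, guarded);   /* fault entry inside the NMI handler */
	return rcu_watching(&c);
}
```

In this model watching_after_fault_in_nmi(true) returns true, because the
guard leaves dynticks alone. watching_after_fault_in_nmi(false) returns false:
the second increment makes dynticks even again while we are still inside the
NMI handler, which is the situation Paul describes.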
>
> > Ah but we have an in_interrupt() check in context_tracking_user_enter() that protects
> > us against that.
>
> Here you are relying on the exception being treated as an interrupt,
> correct?
From an RCU point of view, yeah. In these cases the exception is protected by
either the rcu_irq_*() or the rcu_nmi_*() APIs, depending on where it happened.
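
As a side note on the WARN_ON_ONCE() recursion mentioned in the patch
description quoted above, the ordering problem can be sketched in userspace as
follows. This is only a toy model: struct once_model, MAX_DEPTH, warn_body()
and count_warnings() are hypothetical names, and the "set the flag first"
variant is shown purely for contrast, not as something proposed in this
thread.

```c
#include <stdbool.h>

#define MAX_DEPTH 8 /* hypothetical cap so that the model terminates */

struct once_model {
	bool warned;  /* the "once" flag */
	int warnings; /* how many times the warning body ran */
	int depth;    /* current re-entry depth */
};

static void check(struct once_model *m, bool set_flag_first);

/*
 * Models WARN(): emitting the warning trips the same check again, the way
 * a vmalloc fault taken while warning would re-enter vmalloc_fault().
 */
static void warn_body(struct once_model *m, bool set_flag_first)
{
	m->warnings++;
	if (++m->depth < MAX_DEPTH)
		check(m, set_flag_first);
	m->depth--;
}

/* Models the "warn once" check with either ordering of flag vs. warning. */
static void check(struct once_model *m, bool set_flag_first)
{
	if (m->warned)
		return;
	if (set_flag_first) {
		m->warned = true;
		warn_body(m, set_flag_first);
	} else {
		/* Ordering described in the patch: warn, then set the flag. */
		warn_body(m, set_flag_first);
		m->warned = true;
	}
}

/* Returns how many times the warning body ran for the given ordering. */
int count_warnings(bool set_flag_first)
{
	struct once_model m = { false, 0, 0 };

	check(&m, set_flag_first);
	return m.warnings;
}
```

Here count_warnings(false) returns MAX_DEPTH (8), since only the artificial
depth cap stops the re-entry, while count_warnings(true) returns 1.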