[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20170509163652.GS3956@linux.vnet.ibm.com>
Date: Tue, 9 May 2017 09:36:52 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Josh Poimboeuf <jpoimboe@...hat.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Petr Mladek <pmladek@...e.com>, Jessica Yu <jeyu@...hat.com>,
Jiri Kosina <jikos@...nel.org>,
Miroslav Benes <mbenes@...e.cz>, live-patching@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] livepatch/rcu: Warn when system consistency is
broken in RCU code
On Tue, May 09, 2017 at 11:18:35AM -0500, Josh Poimboeuf wrote:
> On Mon, May 08, 2017 at 03:36:00PM -0700, Paul E. McKenney wrote:
> > On Mon, May 08, 2017 at 05:16:09PM -0500, Josh Poimboeuf wrote:
> > > On Mon, May 08, 2017 at 02:07:54PM -0700, Paul E. McKenney wrote:
> > > > This would be a problem if step 2's NMI hit rcu_irq_enter(),
> > > > rcu_irq_exit(), and friends in just the wrong place.
> > > >
> > > > I would suggest that ftrace() do something like this...
> > > >
> > > > if (in_nmi())
> > > > rcu_nmi_enter();
> > > > else
> > > > rcu_irq_enter();
> > > >
> > > > Except that, as Steven will quickly point out, this won't work at the
> > > > very edges of the NMI, when NMI_MASK won't be set in preempt_count().
> > > >
> > > > Other thoughts?
> > >
> > > Ok. So I think the livepatch ftrace handler would need the in_nmi()
> > > check, in case it's called early in the NMI.
> > >
> > > But on x86, rcu_nmi_enter() is also called in some non-NMI exception
> > > cases, from ist_enter(). So it appears that the in_nmi() check wouldn't
> > > be sufficient. We might instead need something like:
> > >
> > > if (in_nmi() || in_some_other_exception())
> > > rcu_nmi_enter();
> > > else
> > > rcu_irq_enter();
> > >
> > > But unfortunately the in_some_other_exception() function doesn't
> > > currently exist.
> > >
> > > So, one more question. Would it work if we just always called
> > > rcu_nmi_enter()?
> >
> > I am a bit nervous about this. It would -at- -least- be necessary to have
> > interrupts disabled throughout the entire time from the rcu_nmi_enter()
> > through the matching rcu_nmi_exit(). And there might be other failure
> > modes that I don't immediately see.
>
> Ok, let's forget about that idea for now then :-)
Whew!!! ;-)
> > But do we really need this, given the in_nmi() check that Steven
> > pointed out?
>
> The in_nmi() check doesn't work for non-NMI exceptions. An exception
> can come from anywhere, which is presumably why ist_enter() calls
> rcu_nmi_enter(), even though it might not have been in NMI context. The
> exception could, for example, happen while you're twiddling important
> bits in rcu_irq_enter(). Or it could happen early in do_nmi(), before
> it had a chance to set NMI_MASK or call rcu_nmi_enter(). In either
> case, in_nmi() would be false, yet calling rcu_irq_enter() would be bad.
>
> I think I have convinced myself that, as long as the user doesn't patch
> ist_enter() or rcu_dynticks_eqs_enter(), it'll be fine. So the
> following should be sufficient:
>
> if (in_nmi())
> rcu_nmi_enter(); /* in case we're called before nmi_enter() */
> else
> rcu_irq_enter_irqson();
>
> if (unlikely(!rcu_is_watching())) {
> klp_block_patch_removal = true;
> WARN_ON_ONCE(1); /* this presumably means */
> }
As long as you have a similar setup on exit, so that each call to
rcu_nmi_enter() is balanced by a corresponding call to rcu_nmi_exit().
Ditto for rcu_irq_enter_irqson(), of course.
> I think the alternative, calling rcu_irq_enter_disabled() beforehand,
> isn't sufficient, because it only checks the rcu_dynticks_eqs_enter()
> case. It doesn't check the IST exception ist_enter() case, before
> rcu_nmi_enter() has been called.
Yes, calling rcu_irq_enter_disabled() beforehand would be unfortunate
if this was an NMI that occurred in just the wrong place in (say)
rcu_irq_enter(). ;-)
Thanx, Paul
Powered by blists - more mailing lists