linux-kernel - Re: [RFC 2/2] rcu: Remove ->dynticks_nmi_nesting from struct rcu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 25 Jun 2018 12:15:50 -0700
From:   Joel Fernandes <joel@...lfernandes.org>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     Byungchul Park <byungchul.park@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Byungchul Park <max.byungchul.park@...il.com>,
        jiangshanlai@...il.com, josh@...htriplett.org,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        linux-kernel@...r.kernel.org, kernel-team@....com, luto@...nel.org
Subject: Re: [RFC 2/2] rcu: Remove ->dynticks_nmi_nesting from struct
 rcu_dynticks

Hi Paul,

Thanks a lot for your comments, my replies inline:

On Mon, Jun 25, 2018 at 10:19:20AM -0700, Paul E. McKenney wrote:
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> 	When I traced rdtp->dynticks_nesting, I could only find its
> 	value to be either a 0 or a 1. However looking back at old kernel
> 	sources, it appears that these can be nested becaues of so called
> 	“half-interrupts”. I believe these are basically interrupts
> 	that cause a transition to usermode due to usermode upcalls
> 	(usermode helper subsystem). So a nesting situation could be
> 	something like: 1. Transition from idle to process context which
> 	makes dynticks_nesting == 1. Next, an interrupt comes in which
> 	makes a usermode upcall. This usermode call now makes a system
> 	call causing entry back into process context, which increments
> 	the dynticks_nesting counter to 2. Such a crazy situation is
> 	perhaps possible.
> 
> The half-interrupts can instead cause ->dynticks_nmi_nesting to either
> fail to return to zero or to go negative, depending on which half of

Actually in the above paragraph I was referring to a "half interrupt" messing
up dynticks_nesting, not dynticks_nmi_nesting. I know that the latter can be
messed up too but I wasn't referring to dynticks_nmi_nesting in this part of
the article.

I was thinking more in terms of the comment in:
https://elixir.bootlin.com/linux/v3.19.8/source/kernel/rcu/rcu.h#L34
/*
 * Process-level increment to ->dynticks_nesting field.  This allows for
 * architectures that use half-interrupts and half-exceptions from
 * process context.
  ... */

 In my hypothetical example above that you quoted from my notes, I was trying
 to reason about how taking a half-interrupt in process context can cause
 dynticks_nesting to increase to 2. Thinking some more though, I am not sure
 how the above hypothetical example I mentioned can cause this ;) since the
 transition to usermode from the half-interrupt should have corrected the
 dynticks_nesting counter due to the callchain: rcu_user_enter->rcu_eqs_enter ?

> the interrupt was present.  I don't immediately recall the reason for
> allowing nested process-level entry/exit.  Might be another place to
> put a WARN_ON_ONCE(), as eliminating this capability would save another
> conditional branch.

Sure, sounds good to me.

> 
> 	Any time the rdtp->dynticks counter’s second-lowest most bit
> 	is not set, we are in an EQS, and if its set, then we are not
> 	(second lowest because lowest is reserved for something else as
> 	of v4.18-rc1). This function is not useful to check if we’re
> 	in an EQS from a timer tick though, because its possible the
> 	timer tick interrupt entry caused an EQS exit which updated
> 	the counter. IOW, the ‘dynticks’ counter is not capable of
> 	checking if we had already exited the EQS before. To check if
> 	we were in an EQS or not from the timer tick, we instead must
> 	use dynticks_nesting counter. More on that later. The above
> 	function is probably just useful to make sure that interrupt
> 	entry/exit is properly updating the dynticks counter, and also
> 	to make sure from non-interrupt context that RCU is in an EQS
> 	(see rcu_gp_fqs function).
> 
> You lost me on this one.  There is rcu_is_cpu_rrupt_from_idle(), but
> I am not sure what you are trying to achieve here, so I am not sure
> whether this function does what you want.

Sorry about that. Let me try to explain in detail about why I wrote the above
paragraph when talking about rdtp->dynticks.

I was trying to determine how the RCU code determines if the CPU is idle. It
appears from the code that there are 2 ways it does so:

1. By calling rcu_is_cpu_rrupt_from_idle() which checks for the
dynticks_nesting counter. If the counter is 0, then CPU was idle at the time
of the check. This is how rcu_check_callbacks knows that the CPU was idle.

2. By checking for evenness of the dynticks counter. If its even we were idle
(or perhaps in usermode, but I think that extra inference doesn't hurt). This
is done in rcu_dynticks_curr_cpu_in_eqs.

So basically, there are 2 different counters that seem to serve the same
purpose as far as determining if we're in an idle EQS state goes. Right?

Then I was trying to see why we can't just use method 2. in
rcu_check_callbacks to determine if the "timer interrupt was taken while the
CPU was idle".  rcu_check_callbacks could simply call
rcu_dynticks_curr_cpu_in_eqs() from rcu_check_callbacks(). I was trying to
convince myself why that wouldn't work.

I concluded that that wouldn't work because the timer interrupt that led to
the rcu_check_callbacks() call would have tainted the dynticks counter
because of it would have called rcu_nmi_enter() during interrupt entry. So
there's no way to know if the CPU was really idle at the time of the
interrupt if we were to rely on rcu_dynticks_curr_cpu_in_eqs for that. Hence
we would need to rely on method 1 for the "did I take an interrupt while I
was idle" in rcu_check_callbacks() function which uses the dynticks_nesting
counter for this determination. Does that make sense?

> 
> 	When dynticks_nesting is decremented to 0 (the outermost
> 	process-context nesting level exit causes an eqs-entry), the
> 	dynticks_nmi_nesting is reset to
> 
> I think you want "0." at the end of this sentence.  Or maybe my browser
> is messing things up.

Yes the 0. was on the next line, but I moved it back to the previous line so
its easier to read. Thanks for letting me know.

Thanks!

 - Joel