linux-kernel - Re: [PATCH RFC tip/core/rcu 0/4] Forbid static SRCU use in modules

Message-ID: <1447252022.1166.1554734972823.JavaMail.zimbra@efficios.com>
Date:   Mon, 8 Apr 2019 10:49:32 -0400 (EDT)
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     paulmck <paulmck@...ux.ibm.com>
Cc:     "Joel Fernandes, Google" <joel@...lfernandes.org>,
        rcu <rcu@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        dipankar <dipankar@...ibm.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Josh Triplett <josh@...htriplett.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        rostedt <rostedt@...dmis.org>,
        David Howells <dhowells@...hat.com>,
        Eric Dumazet <edumazet@...gle.com>,
        fweisbec <fweisbec@...il.com>, Oleg Nesterov <oleg@...hat.com>,
        linux-nvdimm <linux-nvdimm@...ts.01.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        amd-gfx <amd-gfx@...ts.freedesktop.org>
Subject: Re: [PATCH RFC tip/core/rcu 0/4] Forbid static SRCU use in modules

----- On Apr 8, 2019, at 10:22 AM, paulmck paulmck@...ux.ibm.com wrote:

> On Mon, Apr 08, 2019 at 09:05:34AM -0400, Mathieu Desnoyers wrote:
>> ----- On Apr 7, 2019, at 10:27 PM, paulmck paulmck@...ux.ibm.com wrote:
>> 
>> > On Sun, Apr 07, 2019 at 09:07:18PM +0000, Joel Fernandes wrote:
>> >> On Sun, Apr 07, 2019 at 04:41:36PM -0400, Mathieu Desnoyers wrote:
>> >> > 
>> >> > ----- On Apr 7, 2019, at 3:32 PM, Joel Fernandes, Google joel@...lfernandes.org
>> >> > wrote:
>> >> > 
>> >> > > On Sun, Apr 07, 2019 at 03:26:16PM -0400, Mathieu Desnoyers wrote:
>> >> > >> ----- On Apr 7, 2019, at 9:59 AM, paulmck paulmck@...ux.ibm.com wrote:
>> >> > >> 
>> >> > >> > On Sun, Apr 07, 2019 at 06:39:41AM -0700, Paul E. McKenney wrote:
>> >> > >> >> On Sat, Apr 06, 2019 at 07:06:13PM -0400, Joel Fernandes wrote:
>> >> > >> > 
>> >> > >> > [ . . . ]
>> >> > >> > 
>> >> > >> >> > > diff --git a/include/asm-generic/vmlinux.lds.h
>> >> > >> >> > > b/include/asm-generic/vmlinux.lds.h
>> >> > >> >> > > index f8f6f04c4453..c2d919a1566e 100644
>> >> > >> >> > > --- a/include/asm-generic/vmlinux.lds.h
>> >> > >> >> > > +++ b/include/asm-generic/vmlinux.lds.h
>> >> > >> >> > > @@ -338,6 +338,10 @@
>> >> > >> >> > >  		KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
>> >> > >> >> > >  		__stop___tracepoints_ptrs = .;				\
>> >> > >> >> > >  		*(__tracepoints_strings)/* Tracepoints: strings */	\
>> >> > >> >> > > +		. = ALIGN(8);						\
>> >> > >> >> > > +		__start___srcu_struct = .;				\
>> >> > >> >> > > +		*(___srcu_struct_ptrs)					\
>> >> > >> >> > > +		__end___srcu_struct = .;				\
>> >> > >> >> > >  	}								\
>> >> > >> >> > 
>> >> > >> >> > This vmlinux linker modification is not needed. I tested without it and srcu
>> >> > >> >> > torture works fine with rcutorture built as a module. Putting further prints
>> >> > >> >> > in kernel/module.c verified that the kernel is able to find the srcu structs
>> >> > >> >> > just fine. You could squash the below patch into this one or apply it on top
>> >> > >> >> > of the dev branch.
>> >> > >> >> 
>> >> > >> >> Good point, given that otherwise FORTRAN named common blocks would not
>> >> > >> >> work.
>> >> > >> >> 
>> >> > >> >> But isn't one advantage of leaving that stuff in the RO_DATA_SECTION()
>> >> > >> >> macro that it can be mapped read-only?  Or am I suffering from excessive
>> >> > >> >> optimism?
>> >> > >> > 
>> >> > >> > And to answer the other question, in the case where I am suffering from
>> >> > >> > excessive optimism, it should be a separate commit.  Please see below
>> >> > >> > for the updated original commit thus far.
>> >> > >> > 
>> >> > >> > And may I have your Tested-by?
>> >> > >> 
>> >> > >> Just to confirm: does the cleanup performed in the modules going
>> >> > >> notifier end up acting as a barrier first before freeing the memory ?
>> >> > >> If not, is it explicitly stated that a barrier must be issued before
>> >> > >> module unload ?
>> >> > >> 
>> >> > > 
>> >> > > You mean rcu_barrier? It is mentioned in the documentation that this is the
>> >> > > responsibility of the module writer to prevent delays for all modules.
>> >> > 
>> >> > It's a srcu barrier yes. Considering it would be a barrier specific to the
>> >> > srcu domain within that module, I don't see how it would cause delays for
>> >> > "all" modules if we implicitly issue the barrier on module unload. What
>> >> > am I missing ?
>> >> 
>> >> Yes you are right. I thought of this after I just sent my email. I think it
>> >> makes sense for srcu case to do and could avoid a class of bugs.
>> > 
>> > If there are call_srcu() callbacks outstanding, the module writer still
>> > needs the srcu_barrier() because otherwise callbacks arrive after
>> > the module text has gone, which will disappoint the CPU when it
>> > tries fetching instructions that are no longer mapped.  If there are
>> > no call_srcu() callbacks from that module, then there is no need for
>> > srcu_barrier() either way.
>> > 
>> > So if an srcu_barrier() is needed, the module developer needs to
>> > supply it.
>> 
>> When you say "callbacks arrive after the module text has gone",
>> I think you assume that free_module() is invoked before the
>> MODULE_STATE_GOING notifiers are called. But it's done in the
>> opposite order: going notifiers are called first, and then
>> free_module() is invoked.
>> 
>> So AFAIU it would be safe to issue the srcu_barrier() from the module
>> going notifier.
>> 
>> Or am I missing something ?
> 
> We do seem to be talking past each other.  ;-)
> 
> This has nothing to do with the order of events at module-unload time.
> 
> So please let me try again.
> 
> If a given srcu_struct in a module never has call_srcu() invoked, there
> is no need to invoke rcu_barrier() at any time, whether at module-unload
> time or not.  Adding rcu_barrier() in this case adds overhead and latency
> for no good reason.

Not if we invoke srcu_barrier() for that specific domain. If
call_srcu was never invoked for a srcu domain, I don't see why
srcu_barrier() should be more expensive than a simple check that
the domain does not have any srcu work queued.
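
To make the cost argument concrete, here is a toy user-space model of a
per-domain barrier with an early exit. It is only a sketch of the idea and
says nothing about how the kernel's srcu_barrier() is actually implemented;
every model_* name and MODEL_NR_CPUS are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MODEL_NR_CPUS 4

struct model_srcu_domain {
	size_t n_cbs[MODEL_NR_CPUS];	/* callbacks queued per CPU */
	unsigned long slow_path_waits;	/* times the barrier had to wait */
};

/* Model of call_srcu(): queue one callback on the given CPU. */
static void model_call_srcu(struct model_srcu_domain *d, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	d->n_cbs[cpu]++;
}

/* True if any CPU has a callback queued for this domain. */
static bool model_domain_has_cbs(const struct model_srcu_domain *d)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (d->n_cbs[cpu])
			return true;
	return false;
}

/*
 * Model of a per-domain barrier: if nothing was ever queued, this is
 * just the check above. Otherwise it "waits" by draining the queues.
 */
static void model_srcu_barrier(struct model_srcu_domain *d)
{
	if (!model_domain_has_cbs(d))
		return;			/* fast path: no work queued */

	d->slow_path_waits++;
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		d->n_cbs[cpu] = 0;	/* stand-in for running the callbacks */
}
```

In this model, a domain that never saw model_call_srcu() pays only for the
scan over its per-CPU counters; the slow path is taken only when there is
something to wait for.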

> 
> If a given srcu_struct in a module does have at least one call_srcu()
> invoked, it is already that module's responsibility to make sure that
> the code sticks around long enough for the callback to be invoked.

I understand that when users do explicit dynamic allocation/cleanup of
srcu domains, they indeed need to take care of doing explicit srcu_barrier().
However, if they do static definition of srcu domains, it would be nice
if we can handle the barriers under the hood.

> 
> This means that correct SRCU users that invoke call_srcu() already
> have srcu_barrier() at module-unload time.  Incorrect SRCU users, with
> reasonable probability, now get a WARN_ON() at module-unload time, with
> the per-CPU state getting leaked.  Before this change, they would (also
> with reasonable probability) instead get an instruction-fetch fault when
> the SRCU callback was invoked after the completion of the module unload.
> Furthermore, in all cases where they would previously have gotten the
> instruction-fetch fault, they now get the WARN_ON(), like this:
> 
>	if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
>		return; /* Forgot srcu_barrier(), so just leak it! */
> 
> So this change already represents an improvement in usability.

Considering that we can do a srcu_barrier() for the specific domain,
and that it should add no noticeable overhead if there are no queued
callbacks, I don't see a good reason for leaving the srcu_barrier()
invocation to the user rather than implicitly doing it from the
module going notifier.
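
As a toy illustration of the ordering I have in mind (going notifier first,
then free_module()), here is a small user-space model. It is not kernel
code: all model_* names are hypothetical, and the "no barrier" case models
the old late-callback behavior rather than the WARN_ON()-and-leak behavior
of the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a module owning one statically defined domain. */
struct model_module {
	int pending_cbs;	/* callbacks queued on the module's domain */
	bool text_mapped;	/* is the module text still mapped? */
	int cbs_run_after_free;	/* callbacks that would hit unmapped text */
};

/* Run every pending callback, counting those that run too late. */
static void model_run_pending(struct model_module *m)
{
	while (m->pending_cbs > 0) {
		m->pending_cbs--;
		if (!m->text_mapped)
			m->cbs_run_after_free++;
	}
}

/* Model of the going notifier, optionally issuing the barrier. */
static void model_going_notifier(struct model_module *m, bool implicit_barrier)
{
	if (implicit_barrier)
		model_run_pending(m);	/* text is still mapped here */
}

/* Model of free_module(): the module text goes away. */
static void model_free_module(struct model_module *m)
{
	m->text_mapped = false;
}

/* Unload sequence: notifier, then free, then any leftover callbacks fire. */
static int model_unload(int pending, bool implicit_barrier)
{
	struct model_module m = { .pending_cbs = pending, .text_mapped = true };

	model_going_notifier(&m, implicit_barrier);
	model_free_module(&m);
	model_run_pending(&m);
	return m.cbs_run_after_free;
}
```

With the barrier in the notifier, model_unload() reports no callbacks running
after the text is gone; without it, every pending callback runs too late.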

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com