linux-kernel - Re: [RFC-PATCH 1/2] mm: Add __GFP_NO

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200814161904.GD2674@hirez.programming.kicks-ass.net>
Date:   Fri, 14 Aug 2020 18:19:04 +0200
From:   peterz@...radead.org
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Michal Hocko <mhocko@...e.com>,
        Uladzislau Rezki <urezki@...il.com>,
        LKML <linux-kernel@...r.kernel.org>, RCU <rcu@...r.kernel.org>,
        linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Matthew Wilcox <willy@...radead.org>,
        "Theodore Y . Ts'o" <tytso@....edu>,
        Joel Fernandes <joel@...lfernandes.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Oleksiy Avramchenko <oleksiy.avramchenko@...ymobile.com>
Subject: Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag

On Fri, Aug 14, 2020 at 07:14:25AM -0700, Paul E. McKenney wrote:

> Doing this to kfree_rcu() is the first step.  We will also be doing this
> to call_rcu(), which has some long-standing invocations from various
> raw contexts, including hardirq handler.

Most hardirq handler are not raw on RT due to threaded interrupts.
Lockdep knows about this.

> > >   4) As kfree_rcu() can be used from any context except NMI and RT
> > >      relies on it we ran into a circular dependency problem.
> > 
> > Well, which actual usage sites are under a raw spinlock? Most of the
> > ones I could find are not.
> 
> There are some on their way in, but this same optimization will be applied
> to call_rcu(), and there are no shortage of such call sites in that case.
> And these call sites have been around for a very long time.

Luckily there is lockdep to help you find the ones that need converting
to raw_call_rcu() :-)

> > >   Clearly the simplest solution but not Pauls favourite and
> > >   unfortunately he has a good reason.
> > 
> > Which isn't actually stated anywhere I suppose ?
> 
> Several times, but why not one more?  ;-)
> 
> In CONFIG_PREEMPT_NONE=y kernels, which are heavily used, and for which
> the proposed kfree_rcu() and later call_rcu() optimizations are quite
> important, there is no way to tell at runtime whether or you are in
> atomic raw context.

CONFIG_PREEMPT_NONE has nothing what so ever to do with any of this.

> > > > 2.	Adding a GFP_ flag.
> > > 
> > >   Michal does not like the restriction for !RT kernels and tries to
> > >   avoid the introduction of a new allocation mode.
> > 
> > Like above, I tend to be with Michal on this, just wrap the actual
> > allocation in CONFIG_PREEMPT_RT, the code needs to handle a NULL pointer
> > there anyway.
> 
> That disables the optimization in the CONFIG_PREEMPT_NONE=y case,
> where it really is needed.

No, it disables it for CONFIG_PREEMPT_RT.

> I would be OK with either.  In CONFIG_PREEMPT_NONE=n kernels, the
> kfree_rcu() code could use preemptible() to determine whether it was safe
> to invoke the allocator.  The code in kfree_rcu() might look like this:
> 
> 	mem = NULL;
> 	if (IS_ENABLED(CONFIG_PREEMPT_NONE) || preemptible())
> 		mem = __get_free_page(...);
> 
> Is your point is that the usual mistakes would then be caught by the
> usual testing on CONFIG_PREEMPT_NONE=n kernels?

	mem = NULL;
#if !defined(CONFIG_PREEMPT_RT) && !defined(CONFIG_PROVE_LOCKING)
	mem = __get_free_page(...)
#endif
	if (!mem)

But I _really_ would much rather have raw_kfree_rcu() and raw_call_rcu()
variants for the few places that actually need it.

> > >  These are not seperate of each other as #3 requires #4. The most
> > >  horrible solution IMO from a technical POV as it proliferates
> > >  inconsistency for no good reaosn.
> > > 
> > >  Aside of that it'd be solving a problem which does not exist simply
> > >  because kfree_rcu() does not depend on it and we really don't want to
> > >  set precedence and encourage the (ab)use of this in any way.
> > 
> > My preferred solution is 1, if you want to use an allocator, you simply
> > don't get to play under raw_spinlock_t. And from a quick grep, most
> > kfree_rcu() users are not under raw_spinlock_t context.
> 
> There is at least one on its way in, but more to the point, we will
> need to apply this same optimization to call_rcu(), which is used in

There is no need, call_rcu() works perfectly fine today, thank you.
You might want to, but that's an entirely different thing.

> raw atomic context, including from hardirq handlers.

Threaded IRQs. There really is very little code that is 'raw' on RT.