Message-ID: <20200731212457.GS9247@paulmck-ThinkPad-P72>
Date:   Fri, 31 Jul 2020 14:24:57 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, cl@...ux.com,
        penberg@...nel.org, rientjes@...gle.com, iamjoonsoo.kim@....com,
        hannes@...xchg.org, urezki@...il.com, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: Raw spinlocks and memory allocation

On Fri, Jul 31, 2020 at 09:59:33PM +0100, Matthew Wilcox wrote:
> On Fri, Jul 31, 2020 at 01:48:55PM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 31, 2020 at 01:38:34PM -0700, Andrew Morton wrote:
> > > On Thu, 30 Jul 2020 16:12:05 -0700 "Paul E. McKenney" <paulmck@...nel.org> wrote:
> > > 
> > > > So, may we add a GFP_ flag that will cause kmalloc() and friends to return
> > > > NULL when they would otherwise need to acquire their non-raw spinlock?
> > > > This avoids adding any overhead to the slab-allocator fastpaths, but
> > > > allows callback invocation to reduce cache misses without having to
> > > > restructure some existing callers of call_rcu() and potential future
> > > > callers of kfree_rcu().
> > > 
> > > We have eight free gfp_t bits so that isn't a problem.
> > 
> > Whew!!!  ;-)
> > 
> > > Adding a test-n-branch to the kmalloc() fastpath may well be a concern.
> > > 
> > > Which of mm/sl?b.c are affected?
> > 
> > None of them, it turns out.  The initial patch will instead directly
> > invoke __get_free_page().  So we could just leave sl?b.c alone.
> 
> Isn't that spelled GFP_NOWAIT?

I don't think so in the current kernel, though I might be confused.

The problem we are having isn't waiting, but rather normal spinlock_t
acquisition.  This does not count as waiting in !CONFIG_PREEMPT_RT
kernels, and so there are code paths that acquire the non-raw zone->lock
in rmqueue_bulk() even in the GFP_NOWAIT case.  Because kfree_rcu()
and call_rcu() and their callers might hold raw spinlocks, acquiring a
non-raw spinlock is forbidden for them and for anything that they call,
directly or indirectly.

The reason for this restriction is that in -rt, the spin_lock(&zone->lock)
in rmqueue_bulk() can sleep.  This conversion of non-raw spinlocks
to sleeplocks is part of how -rt reduces scheduling latency.  Because
acquiring a raw spinlock disables preemption (even in -rt), acquiring
a non-raw spinlock while holding a raw spinlock gets you "scheduling
while atomic" in -rt.  And it will get you lockdep complaints in all
kernels, not just -rt, when CONFIG_PROVE_RAW_LOCK_NESTING is enabled.
And my guess is that CONFIG_PROVE_RAW_LOCK_NESTING=y will become the
default sooner rather than later.
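
To make the nesting rule above concrete, here is a minimal userland sketch
in C.  It is not kernel code and does not touch any real locks: the names
model_lock(), model_unlock_raw(), struct ctx, and the verdict values are all
hypothetical and exist only for illustration.  The sketch assumes a context
that merely counts how many raw spinlocks are held, and it reports what the
message says would happen when a non-raw spinlock is taken on top of one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; none of these names exist in the kernel. */
enum lock_kind { LOCK_RAW, LOCK_NONRAW };

enum verdict {
	NEST_OK,            /* acquisition obeys the nesting rule */
	NEST_LOCKDEP_SPLAT, /* invalid nesting in a non-rt kernel: the lock still
	                     * spins, and lockdep complains only if
	                     * CONFIG_PROVE_RAW_LOCK_NESTING is enabled */
	NEST_SCHED_ATOMIC   /* invalid nesting in -rt: the non-raw lock can sleep
	                     * while preemption is disabled */
};

struct ctx {
	int raw_depth; /* number of raw spinlocks currently held */
	bool rt;       /* models CONFIG_PREEMPT_RT=y */
};

/* Classify an acquisition attempt; only raw locks are tracked. */
static enum verdict model_lock(struct ctx *c, enum lock_kind kind)
{
	if (kind == LOCK_RAW) {
		c->raw_depth++; /* raw spinlocks may always nest in this model */
		return NEST_OK;
	}
	if (c->raw_depth == 0)
		return NEST_OK; /* no raw lock held, so a non-raw lock is fine */
	return c->rt ? NEST_SCHED_ATOMIC : NEST_LOCKDEP_SPLAT;
}

/* Drop one raw spinlock in the model. */
static void model_unlock_raw(struct ctx *c)
{
	assert(c->raw_depth > 0);
	c->raw_depth--;
}
```

In this model, a caller that does model_lock(&c, LOCK_RAW) followed by
model_lock(&c, LOCK_NONRAW) gets NEST_SCHED_ATOMIC when c.rt is true and
NEST_LOCKDEP_SPLAT otherwise, which mirrors the kfree_rcu()/call_rcu()
situation: the caller may hold a raw spinlock, and the allocator path may
take the non-raw zone lock.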

But you are right that yet another approach might be modifying the
GFP_NOWAIT handling so that it avoided acquiring non-raw spinlocks.
However, evaluating that option requires quite a bit more knowledge of
MM than I have!  ;-)

							Thanx, Paul
