Message-ID: <20230124152306.wozuqyl7ehl7mzjx@revolver>
Date:   Tue, 24 Jan 2023 10:23:06 -0500
From:   "Liam R. Howlett" <Liam.Howlett@...cle.com>
To:     David Hildenbrand <david@...hat.com>
Cc:     Mike Rapoport <rppt@...nel.org>,
        "maple-tree@...ts.infradead.org" <maple-tree@...ts.infradead.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Jirka Hladky <jhladky@...hat.com>,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH] maple_tree: Remove GFP_ZERO from kmem_cache_alloc() and
 kmem_cache_alloc_bulk()

* David Hildenbrand <david@...hat.com> [230124 08:18]:
> On 06.01.23 19:36, Liam Howlett wrote:
> > * Mike Rapoport <rppt@...nel.org> [230106 02:28]:
> > > On Thu, Jan 05, 2023 at 04:05:34PM +0000, Liam Howlett wrote:
> > > > Preallocations are common in the VMA code to avoid allocating under
> > > > certain locking conditions.  The preallocations must also cover the
> > > > worst-case scenario.  Removing the GFP_ZERO flag from the
> > > > kmem_cache_alloc() (and bulk variant) calls will reduce the amount of
> > > > time spent zeroing memory that may not be used.  Only zero out the
> > > > necessary area to keep track of the allocations in the maple state.
> > > > Zero the entire node prior to using it in the tree.
> > > > 
> > > > This required internal changes to node counting on allocation, so the
> > > > test code is also updated.
> > > > 
> > > > This restores some micro-benchmark performance:
> > > > up to +9% in mmtests mmap1 by my testing
> > > > +10% to +20% in mmap, mmapaddr, mmapmany tests reported by Red Hat
> > > > 
> > > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=2149636
> > > > Reported-by: Jirka Hladky <jhladky@...hat.com>
> > > > Suggested-by: Matthew Wilcox (Oracle) <willy@...radead.org>
> > > > Signed-off-by: Liam Howlett <Liam.Howlett@...cle.com>
> > > > ---
> > > >   lib/maple_tree.c                 | 80 +++++++++++++++++---------------
> > > >   tools/testing/radix-tree/maple.c | 18 +++----
> > > >   2 files changed, 52 insertions(+), 46 deletions(-)
> > > > 
> > > > diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> > > > index 26e2045d3cda..82a8121fe49b 100644
> > > > --- a/lib/maple_tree.c
> > > > +++ b/lib/maple_tree.c
> > > > @@ -149,13 +149,12 @@ struct maple_subtree_state {
> > > >   /* Functions */
> > > >   static inline struct maple_node *mt_alloc_one(gfp_t gfp)
> > > >   {
> > > > -	return kmem_cache_alloc(maple_node_cache, gfp | __GFP_ZERO);
> > > > +	return kmem_cache_alloc(maple_node_cache, gfp);
> > > >   }
> > > >   static inline int mt_alloc_bulk(gfp_t gfp, size_t size, void **nodes)
> > > >   {
> > > > -	return kmem_cache_alloc_bulk(maple_node_cache, gfp | __GFP_ZERO, size,
> > > > -				     nodes);
> > > > +	return kmem_cache_alloc_bulk(maple_node_cache, gfp, size, nodes);
> > > >   }
> > > >   static inline void mt_free_bulk(size_t size, void __rcu **nodes)
> > > > @@ -1127,9 +1126,10 @@ static inline struct maple_node *mas_pop_node(struct ma_state *mas)
> > > >   {
> > > >   	struct maple_alloc *ret, *node = mas->alloc;
> > > >   	unsigned long total = mas_allocated(mas);
> > > > +	unsigned int req = mas_alloc_req(mas);
> > > >   	/* nothing or a request pending. */
> > > > -	if (unlikely(!total))
> > > > +	if (WARN_ON(!total))
> > > 
> > > Hmm, isn't WARN_ON() here too much?
> > 
> > I don't think so.  If we get to the point of asking for a node while we
> > don't have any to give, then it's too late.  It means we (I) have
> > calculated the necessary nodes incorrectly and we won't have enough
> > memory to fit things into the tree.  It should never happen.
> 
> Either way, the suggestion is to use WARN_ON_ONCE() instead of WARN_ON().
> 
> ... now documented in Documentation/process/coding-style.rst
> 

Thanks, this is good information.  I was actually debating a BUG_ON(),
but it is now documented that WARN_ON() is better, and WARN_ON_ONCE()
is best.

I would like the stack trace for any code path that happens to cause it
even if there was a previous code path that has already triggered the
backtrace.

The memory should all have been obtained up front, or we would have
taken an earlier error path, so an OOM condition will not cause a
cascade of flooding backtraces.

This would also result in a null pointer dereference today, so I think
it's safe to say it's rare.

Thanks,
Liam
