lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 29 Sep 2020 02:17:19 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Nick Piggin <npiggin@...il.com>, Hugh Dickins <hughd@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] page_alloc: Fix freeing non-compound pages

On Mon, Sep 28, 2020 at 06:03:07PM -0700, Andrew Morton wrote:
> On Sat, 26 Sep 2020 22:39:19 +0100 "Matthew Wilcox (Oracle)" <willy@...radead.org> wrote:
> 
> > Here is a very rare race which leaks memory:
> 
> Not worth a cc:stable?

Yes, it probably should have been.  I just assume the stablebot will
pick up anything that has a Fixes: tag.

> > Page P0 is allocated to the page cache.  Page P1 is free.
> > 
> > Thread A                Thread B                Thread C
> > find_get_entry():
> > xas_load() returns P0
> > 						Removes P0 from page cache
> > 						P0 finds its buddy P1
> > 			alloc_pages(GFP_KERNEL, 1) returns P0
> > 			P0 has refcount 1
> > page_cache_get_speculative(P0)
> > P0 has refcount 2
> > 			__free_pages(P0)
> 
> 			__free_pages(P0, 1), I assume.

Good catch.  That was what I meant to type.

> > 			P0 has refcount 1
> > put_page(P0)
> 
> but this is implicitly order 0

Right, because it's not a compound page.

> > P1 is not freed
> 
> huh.

Yeah.  Nasty, and we'll never know how often it was hit.

> > Fix this by freeing all the pages in __free_pages() that won't be freed
> > by the call to put_page().  It's usually not a good idea to split a page,
> > but this is a very unlikely scenario.
> > 
> > ...
> >
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -4947,6 +4947,9 @@ void __free_pages(struct page *page, unsigned int order)
> >  {
> >  	if (put_page_testzero(page))
> >  		free_the_page(page, order);
> > +	else if (!PageHead(page))
> > +		while (order-- > 0)
> > +			free_the_page(page + (1 << order), order);
> 
> Well that's weird and scary looking.  `page' has non-zero refcount yet
> we go and free random followon pages.  Methinks it merits an
> explanatory comment?

Well, poot.  I lost that comment in the shuffling of patches.  In a
different tree, I have:

@@ -4943,10 +4943,19 @@ static inline void free_the_page(struct page *page, unsi
gned int order)
                __free_pages_ok(page, order);
 }
 
+/*
+ * If we free a non-compound allocation, another thread may have a
+ * speculative reference to the first page.  It has no way of knowing
+ * about the rest of the allocation, so we have to free all but the
+ * first page here.
+ */
 void __free_pages(struct page *page, unsigned int order)
 {
        if (put_page_testzero(page))
                free_the_page(page, order);
+       else if (!PageHead(page))
+               while (order-- > 0)
+                       free_the_page(page + (1 << order), order);
 }
 EXPORT_SYMBOL(__free_pages);
 

Although I'm now thinking of making that comment into kernel-doc and
turning it into advice to the caller rather than an internal note to
other mm developers.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ