Message-ID: <20230418185608.GA4907@monkey>
Date:   Tue, 18 Apr 2023 11:56:08 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Tarun Sahu <tsahu@...ux.ibm.com>, linux-mm@...ck.org,
        akpm@...ux-foundation.org, muchun.song@...ux.dev,
        aneesh.kumar@...ux.ibm.com, sidhartha.kumar@...cle.com,
        gerald.schaefer@...ux.ibm.com, linux-kernel@...r.kernel.org,
        jaypatel@...ux.ibm.com
Subject: Re: [PATCH] mm/folio: Avoid special handling for order value 0 in
 folio_set_order

On 04/14/23 21:12, Matthew Wilcox wrote:
> On Sat, Apr 15, 2023 at 01:18:32AM +0530, Tarun Sahu wrote:
> > folio_set_order(folio, 0); which is an abuse of folio_set_order as 0-order
> > folio does not have any tail page to set order.
> 
> I think you're missing the point of how folio_set_order() is used.
> When splitting a large folio, we need to zero out the folio_nr_pages
> in the tail, so it does have a tail page, and that tail page needs to
> be zeroed.  We even assert that there is a tail page:
> 
>         if (WARN_ON_ONCE(!folio_test_large(folio)))
>                 return;
> 
> Or maybe you need to explain yourself better.
> 
> > folio->_folio_nr_pages is
> > set to 0 for order 0 in folio_set_order. It is required because
> > _folio_nr_pages overlapped with page->mapping and leaving it non zero
> > caused "bad page" error while freeing gigantic hugepages. This was fixed in
> > Commit ba9c1201beaa ("mm/hugetlb: clear compound_nr before freeing gigantic
> > pages"). Also commit a01f43901cfb ("hugetlb: be sure to free demoted CMA
> > pages to CMA") now explicitly clear page->mapping and hence we won't see
> > the bad page error even if _folio_nr_pages remains unset. Also the order 0
> > folios are not supposed to call folio_set_order, So now we can get rid of
> > folio_set_order(folio, 0) from hugetlb code path to clear the confusion.
> 
> ... this is all very confusing.
> 
> > The patch also moves _folio_set_head and folio_set_order calls in
> > __prep_compound_gigantic_folio() such that we avoid clearing them in the
> > error path.
> 
> But don't we need those bits set while we operate on the folio to set it
> up?  It makes me nervous if we don't have those bits set because we can
> end up with speculative references that point to a head page while that
> page is not marked as a head page.  It may not be a problem, but I want
> to see some air-tight analysis of that.

I am fairly certain we are 'safe'.  Here is the code that runs before the
pointer to the head page is set up in each tail page.

		 * In the case of demote, the ref count will be zero.
		 */
		if (!demote) {
			if (!page_ref_freeze(p, 1)) {
				pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
				goto out_error;
			}
		} else {
			VM_BUG_ON_PAGE(page_count(p), p);
		}
		if (i != 0)
			set_compound_head(p, &folio->page);

So, the ref count of each page will be zero before its pointer to the head
page is set.
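
To make that argument concrete, here is a minimal userspace sketch (C11
atomics, not kernel code) of why a frozen page cannot pick up a speculative
reference.  The names model_page, model_ref_freeze and model_get_unless_zero
are made up for illustration; they only mimic the behaviour of
page_ref_freeze() and a get-unless-zero style speculative reference.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the reference count. */
struct model_page {
	atomic_int refcount;
};

/*
 * Mimics page_ref_freeze(): succeeds only if the count is exactly
 * 'expected', and leaves the count at zero on success.
 */
static bool model_ref_freeze(struct model_page *p, int expected)
{
	int old = expected;

	return atomic_compare_exchange_strong(&p->refcount, &old, 0);
}

/*
 * Mimics a speculative reference: the count is incremented only if it
 * is currently non-zero, so a frozen page is never pinned.
 */
static bool model_get_unless_zero(struct model_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}
```

In this model, a page whose count was inflated by a speculative reference
fails the freeze (the "unexpected inflated ref count" case), and a page that
was frozen successfully rejects any later speculative reference.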

I 'think' it would actually be better to move the calls to _folio_set_head and
folio_set_order in __prep_compound_gigantic_folio() as suggested here.  Why?
In the current code, the ref count on the 'head page' is still 1 (or more)
while those calls are made.  So, someone could take a speculative ref on the
page BEFORE the tail pages are set up.
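
A rough userspace model of the proposed ordering, as I understand it, is
below.  It is only a sketch: gpage, gpage_prep_proposed and GPAGE_NR are
invented names, the error handling is simplified, and the real
__prep_compound_gigantic_folio() does considerably more.  The point is that
every page, head included, is frozen before the head flag becomes visible,
and the error path has no head flag to clear.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define GPAGE_NR 4	/* hypothetical number of pages in the model folio */

/* Hypothetical stand-in for struct page. */
struct gpage {
	atomic_int refcount;
	bool is_head;			/* models the head flag */
	const struct gpage *head;	/* models the compound_head pointer */
};

/* Undo the work done on pages[0..done-1]: unlink and unfreeze them. */
static void gpage_undo(struct gpage *pages, size_t done)
{
	for (size_t j = 0; j < done; j++) {
		pages[j].head = NULL;
		atomic_store(&pages[j].refcount, 1);
	}
}

/*
 * Proposed ordering: freeze each page (expecting a count of 1), link the
 * tails to the head, and mark the head only after the loop has finished.
 */
static bool gpage_prep_proposed(struct gpage *pages)
{
	for (size_t i = 0; i < GPAGE_NR; i++) {
		int expected = 1;

		if (!atomic_compare_exchange_strong(&pages[i].refcount,
						    &expected, 0)) {
			/* Inflated ref count: the head flag was never set. */
			gpage_undo(pages, i);
			return false;
		}
		if (i != 0)
			pages[i].head = &pages[0];
	}
	pages[0].is_head = true;
	return true;
}
```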

TBH, I do not have much of an opinion about potential confusion surrounding
folio_set_order(folio, 0).  IIUC, hugetlb gigantic page setup is the only
place outside the page allocation code that sets up compound pages/large
folios.  So, it is going to be a bit 'special'.  As mentioned, when this was
originally discussed I suggested folio_clear_order().  I would be happy with
either.
-- 
Mike Kravetz
