Message-ID: <c61cdf1d-2feb-ecb3-393d-ca25175c73f4@oracle.com>
Date:   Mon, 11 Jan 2021 17:20:51 -0800
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Muchun Song <songmuchun@...edance.com>, akpm@...ux-foundation.org,
        mhocko@...e.cz
Cc:     n-horiguchi@...jp.nec.com, ak@...ux.intel.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 4/6] mm: hugetlb: add return -EAGAIN for
 dissolve_free_huge_page

On 1/10/21 4:40 AM, Muchun Song wrote:
> There is a race between dissolve_free_huge_page() and put_page(),
> and the race window is quite small. Theoretically, we should return
> -EBUSY when we encounter this race. In practice, because the race
> window is so small, we have a good chance of dissolving the page
> successfully if we retry. Seizing this opportunity is an optimization
> that increases the success rate of dissolving the page.
> 
> If we free a HugeTLB page from a non-task context, it is deferred
> through a workqueue. In this case, we need to flush the work.
> 
> dissolve_free_huge_page() can be called from memory hotplug; the
> caller aims to free the HugeTLB page to the buddy allocator so that
> it can unplug the page successfully.
> 
> Signed-off-by: Muchun Song <songmuchun@...edance.com>
> ---
>  mm/hugetlb.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)

I am unsure about the need for this patch.  The code itself is fine;
I see no issues with it.

As mentioned in the commit message, this is an optimization and could
potentially cause a memory offline operation to succeed instead of fail.
However, we are very unlikely to ever exercise this code.  Adding an
optimization that is unlikely to be exercised is certainly questionable.

Memory offline is the only code that could benefit from this optimization.
Michal, as someone with more memory offline user experience, what is
your opinion?
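
For illustration only (not something that needs to go into the patch),
here is a minimal userspace sketch of the retry-on-EAGAIN pattern the
patch introduces: keep retrying while the callee reports a transient
race, relaxing between attempts.  The names below are hypothetical
stand-ins, not the real mm/hugetlb.c symbols.

/* Build with: gcc -Wall retry_sketch.c -o retry_sketch */
#include <errno.h>
#include <sched.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for dissolve_free_huge_page(): pretend the
 * first attempt races with a concurrent free and reports -EAGAIN,
 * then succeeds on the retry.
 */
static int try_dissolve(int *attempts)
{
	return (*attempts)++ == 0 ? -EAGAIN : 0;
}

int main(void)
{
	int attempts = 0;
	int rc;

	do {
		rc = try_dissolve(&attempts);
		if (rc == -EAGAIN)
			sched_yield();	/* plays the role of cpu_relax()/flush_work() here */
	} while (rc == -EAGAIN);

	printf("dissolved after %d attempt(s), rc=%d\n", attempts, rc);
	return rc ? 1 : 0;
}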

-- 
Mike Kravetz

> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 4a9011e12175..a176ceed55f1 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1763,10 +1763,11 @@ static int free_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
>   * nothing for in-use hugepages and non-hugepages.
>   * This function returns values like below:
>   *
> - *  -EBUSY: failed to dissolved free hugepages or the hugepage is in-use
> - *          (allocated or reserved.)
> - *       0: successfully dissolved free hugepages or the page is not a
> - *          hugepage (considered as already dissolved)
> + *  -EAGAIN: race with __free_huge_page() and can do a retry
> + *  -EBUSY:  failed to dissolve free hugepages or the hugepage is in-use
> + *           (allocated or reserved.)
> + *       0:  successfully dissolved free hugepages or the page is not a
> + *           hugepage (considered as already dissolved)
>   */
>  int dissolve_free_huge_page(struct page *page)
>  {
> @@ -1793,8 +1794,10 @@ int dissolve_free_huge_page(struct page *page)
>  		 * We should make sure that the page is already on the free list
>  		 * when it is dissolved.
>  		 */
> -		if (unlikely(!PageHugeFreed(head)))
> +		if (unlikely(!PageHugeFreed(head))) {
> +			rc = -EAGAIN;
>  			goto out;
> +		}
>  
>  		/*
>  		 * Move PageHWPoison flag from head page to the raw error page,
> @@ -1813,6 +1816,14 @@ int dissolve_free_huge_page(struct page *page)
>  	}
>  out:
>  	spin_unlock(&hugetlb_lock);
> +
> +	/*
> +	 * If the freeing of the HugeTLB page is put on a work queue, we should
> +	 * flush the work before retrying.
> +	 */
> +	if (unlikely(rc == -EAGAIN))
> +		flush_work(&free_hpage_work);
> +
>  	return rc;
>  }
>  
> @@ -1835,7 +1846,12 @@ int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
>  
>  	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order) {
>  		page = pfn_to_page(pfn);
> +retry:
>  		rc = dissolve_free_huge_page(page);
> +		if (rc == -EAGAIN) {
> +			cpu_relax();
> +			goto retry;
> +		}
>  		if (rc)
>  			break;
>  	}
> 
