Message-ID: <20171108142528.vsrkkqw6fihxdjio@dhcp22.suse.cz>
Date:   Wed, 8 Nov 2017 15:25:28 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Vitaly Kuznetsov <vkuznets@...hat.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Mel Gorman <mgorman@...hsingularity.net>,
        YASUAKI ISHIMATSU <yasu.isimatu@...il.com>,
        Hillf Danton <hillf.zj@...baba-inc.com>,
        Johannes Weiner <hannes@...xchg.org>,
        "K. Y. Srinivasan" <kys@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Alex Ng <alexng@...rosoft.com>
Subject: Re: [PATCH RFC] mm/memory_hotplug: make it possible to offline
 blocks with reserved pages

On Wed 08-11-17 14:01:55, Vitaly Kuznetsov wrote:
> Hyper-V balloon driver needs to hotplug memory in smaller chunks and to
> work around Linux's 128Mb alignment requirement, so it does a trick: partly
> populated 128Mb blocks are added, and then a custom online_page_callback
> hook checks whether the particular page is 'backed' during onlining; in
> case it is not backed it is left in the Reserved state. When the host adds
> more pages to the block we bring them online from the driver (see
> hv_bring_pgs_online()/hv_page_online_one() in drivers/hv/hv_balloon.c).
> Eventually the whole block becomes fully populated and we hotplug the next
> 128Mb. This has all been working for quite some time already.

Why does Hyper-V need to work around the section size limit in the first
place? We allocate the memmap for the whole section anyway, so it won't
save any memory. The whole thing sounds rather dubious to me.

> What is not working is offlining of such partly populated blocks: the
> check_pages_isolated_cb() callback will not pass with a single Reserved
> page and we end up with -EBUSY. However, there's no reason to fail
> offlining in this case: these pages are already offline, we may just skip
> them. Add the appropriate workaround to test_pages_isolated().

How do you recognize pages reserved by other users? You cannot simply
remove them; that would just blow up.

> Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
> ---
> RFC part:
> - Other usages of Reserved pages making offlining blocks with them a no-go
>   may exist.
> - I'm not exactly sure that adding another parameter to
>   test_pages_isolated() is a good idea, we may go with a single flag for
>   both Reserved and HwPoisoned pages: we have just two call sites and they
>   have opposite needs (true, true in one case and false, false in the
>   other).
> ---
>  include/linux/page-isolation.h |  2 +-
>  mm/memory_hotplug.c            |  2 +-
>  mm/page_alloc.c                |  8 +++++++-
>  mm/page_isolation.c            | 11 ++++++++---
>  4 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
> index 05a04e603686..daba12a59574 100644
> --- a/include/linux/page-isolation.h
> +++ b/include/linux/page-isolation.h
> @@ -61,7 +61,7 @@ undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>   * Test all pages in [start_pfn, end_pfn) are isolated or not.
>   */
>  int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
> -			bool skip_hwpoisoned_pages);
> +			bool skip_hwpoisoned_pages, bool skip_reserved_pages);
>  
>  struct page *alloc_migrate_target(struct page *page, unsigned long private,
>  				int **resultp);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index d4b5f29906b9..5b7d1482804f 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1467,7 +1467,7 @@ check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
>  {
>  	int ret;
>  	long offlined = *(long *)data;
> -	ret = test_pages_isolated(start_pfn, start_pfn + nr_pages, true);
> +	ret = test_pages_isolated(start_pfn, start_pfn + nr_pages, true, true);
>  	offlined = nr_pages;
>  	if (!ret)
>  		*(long *)data += offlined;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 77e4d3c5c57b..b475928c476c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7632,7 +7632,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  	}
>  
>  	/* Make sure the range is really isolated. */
> -	if (test_pages_isolated(outer_start, end, false)) {
> +	if (test_pages_isolated(outer_start, end, false, false)) {
>  		pr_info_ratelimited("%s: [%lx, %lx) PFNs busy\n",
>  			__func__, outer_start, end);
>  		ret = -EBUSY;
> @@ -7746,6 +7746,12 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
>  			continue;
>  		}
>  
> +		/* Some pages might never be online, skip them */
> +		if (unlikely(PageReserved(page))) {
> +			pfn++;
> +			continue;
> +		}
> +
>  		BUG_ON(page_count(page));
>  		BUG_ON(!PageBuddy(page));
>  		order = page_order(page);
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 44f213935bf6..fd9c18e00b92 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -233,7 +233,8 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>   */
>  static unsigned long
>  __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
> -				  bool skip_hwpoisoned_pages)
> +				  bool skip_hwpoisoned_pages,
> +				  bool skip_reserved_pages)
>  {
>  	struct page *page;
>  
> @@ -253,6 +254,9 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
>  		else if (skip_hwpoisoned_pages && PageHWPoison(page))
>  			/* A HWPoisoned page cannot be also PageBuddy */
>  			pfn++;
> +		else if (skip_reserved_pages && PageReserved(page))
> +			/* Skipping Reserved pages */
> +			pfn++;
>  		else
>  			break;
>  	}
> @@ -262,7 +266,7 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
>  
>  /* Caller should ensure that requested range is in a single zone */
>  int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
> -			bool skip_hwpoisoned_pages)
> +			bool skip_hwpoisoned_pages, bool skip_reserved_pages)
>  {
>  	unsigned long pfn, flags;
>  	struct page *page;
> @@ -285,7 +289,8 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
>  	zone = page_zone(page);
>  	spin_lock_irqsave(&zone->lock, flags);
>  	pfn = __test_page_isolated_in_pageblock(start_pfn, end_pfn,
> -						skip_hwpoisoned_pages);
> +						skip_hwpoisoned_pages,
> +						skip_reserved_pages);
>  	spin_unlock_irqrestore(&zone->lock, flags);
>  
>  	trace_test_pages_isolated(start_pfn, end_pfn, pfn);
> -- 
> 2.13.6
> 

-- 
Michal Hocko
SUSE Labs
