lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130701091355.GA14444@gchen.bj.intel.com>
Date:	Mon, 1 Jul 2013 05:13:55 -0400
From:	Chen Gong <gong.chen@...ux.intel.com>
To:	Naoya Horiguchi <n-horiguchi@...jp.nec.com>
Cc:	linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
	Andi Kleen <andi@...stfloor.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/memory-failure.c: fix memory leak in successful soft
 offlining

On Fri, May 17, 2013 at 12:18:02PM -0400, Naoya Horiguchi wrote:
> Date: Fri, 17 May 2013 12:18:02 -0400
> From: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
> To: linux-mm@...ck.org
> Cc: Andrew Morton <akpm@...ux-foundation.org>, Andi Kleen
>  <andi@...stfloor.org>, linux-kernel@...r.kernel.org, Naoya Horiguchi
>  <n-horiguchi@...jp.nec.com>
> Subject: [PATCH] mm/memory-failure.c: fix memory leak in successful soft
>  offlining
> 
> After a successful page migration by soft offlining, the source page is
> not properly freed and it's never reusable even if we unpoison it afterward.
> 
> This is caused by the race between freeing page and setting PG_hwpoison.
> In successful soft offlining, the source page is put (and the refcount
> becomes 0) by putback_lru_page() in unmap_and_move(), where it's linked to
> pagevec and actual freeing back to buddy is delayed. So if PG_hwpoison is
> set for the page before freeing, the freeing does not functions as expected
> (in such case freeing aborts in free_pages_prepare() check.)
> 
> This patch tries to make sure to free the source page before setting
> PG_hwpoison on it. To avoid reallocating, the page keeps MIGRATE_ISOLATE
> until after setting PG_hwpoison.
> 
> This patch also removes obsolete comments about "keeping elevated refcount"
> because what they say is not true. Unlike memory_failure(), soft_offline_page()
> uses no special page isolation code, and the soft-offlined pages have no
> difference from buddy pages except PG_hwpoison. So no need to keep refcount
> elevated.
> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
> ---
>  mm/memory-failure.c | 22 ++++++++++++++++++----
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git linux-v3.9-rc3.orig/mm/memory-failure.c linux-v3.9-rc3/mm/memory-failure.c
> index 4e01082..894262d 100644
> --- linux-v3.9-rc3.orig/mm/memory-failure.c
> +++ linux-v3.9-rc3/mm/memory-failure.c
> @@ -1410,7 +1410,8 @@ static int __get_any_page(struct page *p, unsigned long pfn, int flags)
>  
>  	/*
>  	 * Isolate the page, so that it doesn't get reallocated if it
> -	 * was free.
> +	 * was free. This flag should be kept set until the source page
> +	 * is freed and PG_hwpoison on it is set.
>  	 */
>  	set_migratetype_isolate(p, true);
>  	/*
> @@ -1433,7 +1434,6 @@ static int __get_any_page(struct page *p, unsigned long pfn, int flags)
>  		/* Not a free page */
>  		ret = 1;
>  	}
> -	unset_migratetype_isolate(p, MIGRATE_MOVABLE);
>  	unlock_memory_hotplug();
>  	return ret;
>  }
> @@ -1503,7 +1503,6 @@ static int soft_offline_huge_page(struct page *page, int flags)
>  		atomic_long_add(1 << compound_trans_order(hpage),
>  				&num_poisoned_pages);
>  	}
> -	/* keep elevated page count for bad page */
>  	return ret;
>  }
>  
> @@ -1568,7 +1567,7 @@ int soft_offline_page(struct page *page, int flags)
>  			atomic_long_inc(&num_poisoned_pages);
>  		}
>  	}
> -	/* keep elevated page count for bad page */
> +	unset_migratetype_isolate(page, MIGRATE_MOVABLE);
>  	return ret;
>  }
>  
> @@ -1634,7 +1633,22 @@ static int __soft_offline_page(struct page *page, int flags)
>  			if (ret > 0)
>  				ret = -EIO;
>  		} else {
> +			/*
> +			 * After page migration succeeds, the source page can
> +			 * be trapped in pagevec and actual freeing is delayed.
> +			 * Freeing code works differently based on PG_hwpoison,
> +			 * so there's a race. We need to make sure that the
> +			 * source page should be freed back to buddy before
> +			 * setting PG_hwpoison.
> +			 */
> +			if (!is_free_buddy_page(page))
> +				lru_add_drain_all();
> +			if (!is_free_buddy_page(page))
> +				drain_all_pages();
>  			SetPageHWPoison(page);
> +			if (!is_free_buddy_page(page))
> +				pr_info("soft offline: %#lx: page leaked\n",
> +					pfn);
>  			atomic_long_inc(&num_poisoned_pages);
>  		}
>  	} else {
> -- 
> 1.7.11.7
> 
Hi, Naoya

What happens about this patch? It looks find to me but not merged yet.
If something I missed, would you please tell me again?

Download attachment "signature.asc" of type "application/pgp-signature" (837 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ