Message-ID: <20171020023019.GA9318@hori1.linux.bs1.fc.nec.co.jp>
Date: Fri, 20 Oct 2017 02:30:20 +0000
From: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
To: Mike Kravetz <mike.kravetz@...cle.com>
CC: "linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Michal Hocko <mhocko@...nel.org>,
"Aneesh Kumar" <aneesh.kumar@...ux.vnet.ibm.com>,
Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH 1/1] mm:hugetlbfs: Fix hwpoison reserve accounting
On Thu, Oct 19, 2017 at 04:00:07PM -0700, Mike Kravetz wrote:
> Calling madvise(MADV_HWPOISON) on a hugetlbfs page will result in
> bad (negative) reserved huge page counts. This may not happen
> immediately, but may happen later when the underlying file is
> removed or filesystem unmounted. For example:
> AnonHugePages: 0 kB
> ShmemHugePages: 0 kB
> HugePages_Total: 1
> HugePages_Free: 0
> HugePages_Rsvd: 18446744073709551615
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
>
> In routine hugetlbfs_error_remove_page(), hugetlb_fix_reserve_counts
> is called after remove_huge_page. hugetlb_fix_reserve_counts is
> designed to be called only if a failure is returned from
> hugetlb_unreserve_pages. Therefore, call hugetlb_unreserve_pages
> as required and only call hugetlb_fix_reserve_counts in the unlikely
> event that hugetlb_unreserve_pages returns an error.
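
For reference, the HugePages_Rsvd value in the quoted report, 18446744073709551615, equals 2^64 - 1. That is what an unsigned 64-bit counter shows after one decrement from zero. A minimal C sketch of the wraparound is given here; the helper name unreserve_one is hypothetical (it is not a kernel function), and the 64-bit width is an assumption based on the printed value.

```c
#include <stdint.h>

/*
 * Hypothetical helper: decrement an unsigned 64-bit counter once,
 * without any underflow check. Unsigned arithmetic wraps modulo 2^64,
 * so decrementing 0 yields the largest representable value.
 */
uint64_t unreserve_one(uint64_t rsvd)
{
    return rsvd - 1;
}
```

Printing unreserve_one(0) as an unsigned decimal gives the same digits as the HugePages_Rsvd line in the report, which is why a "negative" reserve count shows up as a very large number.
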
Hi Mike,

Thank you for addressing this. The patch itself looks good to me, but
the reported issue (negative reserve count) doesn't reproduce in my testing
with v4.14-rc5, so could you share the exact procedure that triggers it?
When the error handler runs on a huge page, the reserve count is incremented,
so I'm not sure why the reserve count goes negative. My steps were as follows:
$ sysctl vm.nr_hugepages=10
$ grep HugePages_ /proc/meminfo
HugePages_Total: 10
HugePages_Free: 10
HugePages_Rsvd: 0
HugePages_Surp: 0
$ ./test_alloc_generic -B hugetlb_file -N1 -L "mmap access memory_error_injection:error_type=madv_hard" // allocate a 2MB file on hugetlbfs, then madvise(MADV_HWPOISON) on it.
$ grep HugePages_ /proc/meminfo
HugePages_Total: 10
HugePages_Free: 9
HugePages_Rsvd: 1 // reserve count is incremented
HugePages_Surp: 0
$ rm work/hugetlbfs/testfile
$ grep HugePages_ /proc/meminfo
HugePages_Total: 10
HugePages_Free: 9
HugePages_Rsvd: 0 // reserve count is gone
HugePages_Surp: 0
$ /src/linux-dev/tools/vm/page-types -b hwpoison -x // unpoison the huge page
$ grep HugePages_ /proc/meminfo
HugePages_Total: 10
HugePages_Free: 10 // all huge pages are free (back to the beginning)
HugePages_Rsvd: 0
HugePages_Surp: 0
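
The steps above can be approximated without the test tool. The sketch below is a stand-in under stated assumptions, not the test_alloc_generic implementation: meminfo_counter and poison_hugetlb_file are hypothetical helper names, the 2 MB huge page size is taken from this thread, and the file is assumed to live on a mounted hugetlbfs. MADV_HWPOISON needs CAP_SYS_ADMIN and a kernel built with CONFIG_MEMORY_FAILURE.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define HPAGE_SIZE (2UL * 1024 * 1024) /* assumption: 2 MB huge pages */

/*
 * Look up one "Key: value" counter in /proc/meminfo-style text.
 * Returns 0 and stores the value in *out, or -1 if the key is absent.
 */
int meminfo_counter(const char *text, const char *key, unsigned long long *out)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return sscanf(line + klen + 1, "%llu", out) == 1 ? 0 : -1;
        line = strchr(line, '\n');   /* advance to the next line, if any */
        if (line)
            line++;
    }
    return -1;
}

/*
 * Map the first huge page of a hugetlbfs file, touch it, then inject a
 * memory error with madvise(MADV_HWPOISON). Returns 0 on success, -1 on
 * failure. The page stays poisoned until it is explicitly unpoisoned.
 */
int poison_hugetlb_file(const char *path)
{
    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    char *p = mmap(NULL, HPAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping keeps the file referenced */
    if (p == MAP_FAILED)
        return -1;

    p[0] = 1;                        /* fault the huge page in */
    /* Poisoning one base page is enough to hit the containing huge page. */
    int ret = madvise(p, (size_t)sysconf(_SC_PAGESIZE), MADV_HWPOISON);
    munmap(p, HPAGE_SIZE);
    return ret == 0 ? 0 : -1;
}
```

Calling poison_hugetlb_file("work/hugetlbfs/testfile") and then reading "HugePages_Rsvd" from the contents of /proc/meminfo with meminfo_counter would roughly correspond to the second grep above. This should only be tried on a disposable test machine.
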
Thanks,
Naoya Horiguchi
>
> Fixes: 78bb920344b8 ("mm: hwpoison: dissolve in-use hugepage in unrecoverable memory error")
> Cc: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
> Cc: Michal Hocko <mhocko@...nel.org>
> Cc: Aneesh Kumar <aneesh.kumar@...ux.vnet.ibm.com>
> Cc: Anshuman Khandual <khandual@...ux.vnet.ibm.com>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: <stable@...r.kernel.org>
> Signed-off-by: Mike Kravetz <mike.kravetz@...cle.com>
> ---
> fs/hugetlbfs/inode.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 59073e9f01a4..ed113ea17aff 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -842,9 +842,12 @@ static int hugetlbfs_error_remove_page(struct address_space *mapping,
>  				struct page *page)
>  {
>  	struct inode *inode = mapping->host;
> +	pgoff_t index = page->index;
>
>  	remove_huge_page(page);
> -	hugetlb_fix_reserve_counts(inode);
> +	if (unlikely(hugetlb_unreserve_pages(inode, index, index + 1, 1)))
> +		hugetlb_fix_reserve_counts(inode);
> +
>  	return 0;
>  }
>
> --
> 2.13.6
>
>