Message-ID: <57FF7BB4.1070202@redhat.com>
Date: Thu, 13 Oct 2016 14:19:00 +0200
From: Jan Stancek <jstancek@...hat.com>
To: linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc: mike.kravetz@...cle.com, hillf.zj@...baba-inc.com,
dave.hansen@...ux.intel.com, kirill.shutemov@...ux.intel.com,
mhocko@...e.cz, n-horiguchi@...jp.nec.com,
aneesh.kumar@...ux.vnet.ibm.com, iamjoonsoo.kim@....com
Subject: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually
kill my system
Hi,
I'm running into ENOMEM failures with the libhugetlbfs testsuite [1] on
a power8 lpar system running 4.8 or latest git [2]. Repeated runs of
this suite trigger multiple OOMs that eventually kill the entire system;
it usually takes 3-5 runs:
* Total System Memory......: 18024 MB
* Shared Mem Max Mapping...: 320 MB
* System Huge Page Size....: 16 MB
* Available Huge Pages.....: 20
* Total size of Huge Pages.: 320 MB
* Remaining System Memory..: 17704 MB
* Huge Page User Group.....: hugepages (1001)
I see this only on ppc (BE/LE); x86_64 seems unaffected and successfully
ran the tests for ~12 hours.
Bisect has identified the following patch as the culprit:
commit 67961f9db8c477026ea20ce05761bde6f8bf85b0
Author: Mike Kravetz <mike.kravetz@...cle.com>
Date: Wed Jun 8 15:33:42 2016 -0700
mm/hugetlb: fix huge page reserve accounting for private mappings
The following patch (made with my limited insight), applied to
latest git [2], fixes the problem for me:
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ec49d9e..7261583 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1876,7 +1876,7 @@ static long __vma_reservation_common(struct hstate *h,
 		 * return value of this routine is the opposite of the
 		 * value returned from reserve map manipulation routines above.
 		 */
-		if (ret)
+		if (ret >= 0)
 			return 0;
 		else
 			return 1;
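
To make the effect of the one-line change easier to see, here is a minimal
userspace sketch of the two conditions. check_before() and check_after() are
hypothetical stand-ins written for illustration, not kernel functions, and
the hunk above does not show which values of ret can actually reach this
branch, so the sample inputs are assumptions as well.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the check before the patch (not kernel code):
 * any nonzero ret yields 0, only ret == 0 yields 1. */
static long check_before(long ret)
{
	if (ret)
		return 0;
	else
		return 1;
}

/* Hypothetical stand-in for the check after the patch (not kernel code):
 * any ret >= 0 yields 0, only a negative ret yields 1. */
static long check_after(long ret)
{
	if (ret >= 0)
		return 0;
	else
		return 1;
}

/* Print both results for a negative, zero and positive sample value. */
static void print_comparison(void)
{
	const long samples[] = { -1, 0, 1 };
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		/* The two versions are expected to agree only for ret > 0. */
		if (samples[i] > 0)
			assert(check_before(samples[i]) == check_after(samples[i]));
		printf("ret=%ld before=%ld after=%ld\n", samples[i],
		       check_before(samples[i]), check_after(samples[i]));
	}
}
```

Calling print_comparison() from a small main() shows that the two versions
differ for ret == 0 and for negative ret, and agree only for positive ret.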
Regards,
Jan
[1] https://github.com/libhugetlbfs/libhugetlbfs
[2] v4.8-14230-gb67be92