Message-ID: <57FF7BB4.1070202@redhat.com>
Date:   Thu, 13 Oct 2016 14:19:00 +0200
From:   Jan Stancek <jstancek@...hat.com>
To:     linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc:     mike.kravetz@...cle.com, hillf.zj@...baba-inc.com,
        dave.hansen@...ux.intel.com, kirill.shutemov@...ux.intel.com,
        mhocko@...e.cz, n-horiguchi@...jp.nec.com,
        aneesh.kumar@...ux.vnet.ibm.com, iamjoonsoo.kim@....com
Subject: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually
 kill my system

Hi,

I'm running into ENOMEM failures with the libhugetlbfs testsuite [1] on
a POWER8 LPAR system running 4.8 or the latest git [2]. Repeated runs of
the suite trigger multiple OOMs that eventually kill the entire system;
this usually takes 3-5 runs:

 * Total System Memory......:  18024 MB
 * Shared Mem Max Mapping...:    320 MB
 * System Huge Page Size....:     16 MB
 * Available Huge Pages.....:     20
 * Total size of Huge Pages.:    320 MB
 * Remaining System Memory..:  17704 MB
 * Huge Page User Group.....:  hugepages (1001)

I see this only on ppc (BE/LE). x86_64 seems unaffected: it ran the
tests successfully for ~12 hours.

Bisection identified the following commit as the culprit:
  commit 67961f9db8c477026ea20ce05761bde6f8bf85b0
  Author: Mike Kravetz <mike.kravetz@...cle.com>
  Date:   Wed Jun 8 15:33:42 2016 -0700
    mm/hugetlb: fix huge page reserve accounting for private mappings


The following patch (made with my limited insight), applied to the
latest git [2], fixes the problem for me:

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ec49d9e..7261583 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1876,7 +1876,7 @@ static long __vma_reservation_common(struct hstate *h,
                 * return value of this routine is the opposite of the
                 * value returned from reserve map manipulation routines above.
                 */
-               if (ret)
+               if (ret >= 0)
                        return 0;
                else
                        return 1;
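
To make the effect of the one-line change easier to see, here is a minimal
standalone sketch of just the branch touched by the hunk above. It is not
the kernel function: the names branch_before(), branch_after() and
print_branch_comparison() are made up for illustration, and the code
surrounding the branch in mm/hugetlb.c (which may restrict the values of
ret that reach it) is deliberately left out. In isolation, the two
conditions agree for positive ret and differ for ret == 0 and for
negative ret.

```c
#include <stdio.h>

/* Branch as it reads before the patch: any nonzero ret maps to 0. */
int branch_before(long ret)
{
	if (ret)
		return 0;
	else
		return 1;
}

/* Branch with the patch applied: any non-negative ret maps to 0. */
int branch_after(long ret)
{
	if (ret >= 0)
		return 0;
	else
		return 1;
}

/* Print both results for a negative, a zero and a positive ret. */
void print_branch_comparison(void)
{
	const long samples[] = { -1, 0, 1 };
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		printf("ret=%2ld before=%d after=%d\n", samples[i],
		       branch_before(samples[i]), branch_after(samples[i]));
}
```

Calling print_branch_comparison() from a small main() prints one line per
sample value, which is enough to compare the two conditions side by side.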

Regards,
Jan

[1] https://github.com/libhugetlbfs/libhugetlbfs
[2] v4.8-14230-gb67be92
