linux-kernel - Re: [RFC Patch 1/1] mm/hugetlb: Clarify OOM message on size of hugetlb and requested hugepages total

Message-ID: <20170913155204.w75sgaosyqi6it57@oracle.com>
Date:   Wed, 13 Sep 2017 11:52:05 -0400
From:   "Liam R. Howlett" <Liam.Howlett@...cle.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Gerald Schaefer <gerald.schaefer@...ibm.com>,
        zhong jiang <zhongjiang@...wei.com>,
        Hillf Danton <hillf.zj@...baba-inc.com>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC Patch 1/1] mm/hugetlb: Clarify OOM message on size of
 hugetlb and requested hugepages total

* Michal Hocko <mhocko@...nel.org> [170913 08:43]:
> On Mon 11-09-17 11:48:20, Liam R. Howlett wrote:
> > Change the output of hugetlb_show_meminfo to give the size of the
> > hugetlb in more than just Kb and add a warning message if the requested
> > hugepages is larger than the allocated hugepages.  The warning message
> > for very badly configured hugepages has been removed in favour of this
> > method.
> > 
> > The new messages look like this:
> > ----
> > Node 0 hugepages_total=1 hugepages_free=1 hugepages_surp=0
> > hugepages_size=1.00 GiB
> > 
> > Node 0 hugepages_total=1326 hugepages_free=1326 hugepages_surp=0
> > hugepages_size=2.00 MiB
> > 
> > hugepage_size 1.00 GiB: Requested 5 hugepages (5.00 GiB) but 1 hugepages
> > (1.00 GiB) were allocated.
> > 
> > hugepage_size 2.00 MiB: Requested 4000 hugepages (7.81 GiB) but 1326
> > hugepages (2.59 GiB) were allocated.
> > ----
> > 
> > The old messages look like this:
> > ----
> > Node 0 hugepages_total=1 hugepages_free=1 hugepages_surp=0
> > hugepages_size=1048576kB
> > 
> > Node 0 hugepages_total=1435 hugepages_free=1435 hugepages_surp=0
> > hugepages_size=2048kB
> > ----
> > 
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@...cle.com>
> 
> To be honest, I really dislike this. It doesn't really add anything
> really new to the OOM report. We already know how much memory is
> unreclaimable because it is reserved for hugetlb usage. Why does the
> requested size make any difference? We could fail to allocate requested
> number of pages because of memory pressure or fragmentation without any
> sign of misconfiguration.

Okay, thanks.  I was trying to address the issues you had with the
previous logging addition.

I understand that the OOM report is clear to many, but I thought it
would be clearer if the hugepage size were printed in a human-readable
format instead of kB, especially on platforms that support many huge
page sizes. We already use this formatting elsewhere.
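
As a rough userspace illustration of the formatting I mean, here is a
minimal sketch that turns a size in kB into a binary-unit string. The
helper name format_kb() is hypothetical and this is not the kernel's
own formatting code; it uses floating point, which kernel code would
avoid.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper for illustration only: format a size given in kB
 * as a string in binary units with two decimals.
 */
static void format_kb(unsigned long long kb, char *buf, size_t len)
{
	static const char *const units[] = { "KiB", "MiB", "GiB", "TiB" };
	const size_t last = sizeof(units) / sizeof(units[0]) - 1;
	double value = (double)kb;
	size_t i = 0;

	/* Step up one unit for every factor of 1024, stopping at TiB. */
	while (value >= 1024.0 && i < last) {
		value /= 1024.0;
		i++;
	}
	snprintf(buf, len, "%.2f %s", value, units[i]);
}
```

With the sizes from the old messages, 1048576 kB comes out as
"1.00 GiB" and 2048 kB as "2.00 MiB".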

My intent with the requested size was to expose the failure to allocate
a resource, which currently isn't reported back to the user at all -
except on boot failures, which you also disliked.  I thought reporting
it in the OOM message would be less of a change than reporting at
allocation time, and that it would make clearer what happened on poorly
configured systems, as the failure would be printed closer to the panic.
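
For reference, the arithmetic behind the proposed "Requested ... but
... were allocated" line can be sketched in userspace as below. The
function names are hypothetical and this is not the code from the
patch; it only shows how the page counts and the page size in kB
combine into the GiB totals.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: total size in GiB of 'pages' pages of 'page_kb' kB. */
static double pages_to_gib(unsigned long pages, unsigned long long page_kb)
{
	return (double)pages * (double)page_kb / (1024.0 * 1024.0);
}

/*
 * Hypothetical helper: write a shortfall message into buf and return 1
 * if fewer pages were allocated than requested; otherwise leave buf
 * empty and return 0.
 */
static int format_shortfall(unsigned long requested, unsigned long allocated,
			    unsigned long long page_kb, char *buf, size_t len)
{
	if (allocated >= requested) {
		if (len > 0)
			buf[0] = '\0';
		return 0;
	}
	snprintf(buf, len,
		 "Requested %lu hugepages (%.2f GiB) but %lu hugepages (%.2f GiB) were allocated.",
		 requested, pages_to_gib(requested, page_kb),
		 allocated, pages_to_gib(allocated, page_kb));
	return 1;
}
```

For the 1 GiB case above (5 requested, 1 allocated, 1048576 kB pages)
this yields 5.00 GiB requested against 1.00 GiB allocated.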

> 
> Also req_max_huge_pages would have to be per NUMA node otherwise you are
> just losing information when allocating hugetlb pages via sysfs per node
> interface.
> 

Thank you for your thorough review and time,
Liam