linux-kernel - Re: [PATCH 3/3] hugetlbfs: don't retry when pool page allocations start to fail


Message-ID: <dfb0f20f-7a2d-3228-5c0d-9da4793f575c@oracle.com>
Date:   Mon, 5 Aug 2019 10:12:00 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Vlastimil Babka <vbabka@...e.cz>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Cc:     Hillf Danton <hdanton@...a.com>, Michal Hocko <mhocko@...nel.org>,
        Mel Gorman <mgorman@...e.de>,
        Johannes Weiner <hannes@...xchg.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        David Rientjes <rientjes@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 3/3] hugetlbfs: don't retry when pool page allocations
 start to fail

On 8/5/19 2:28 AM, Vlastimil Babka wrote:
> On 8/3/19 12:39 AM, Mike Kravetz wrote:
>> When allocating hugetlbfs pool pages via /proc/sys/vm/nr_hugepages,
>> the pages will be interleaved between all nodes of the system.  If
>> nodes are not equal, it is quite possible for one node to fill up
>> before the others.  When this happens, the code still attempts to
>> allocate pages from the full node.  This results in calls to direct
>> reclaim and compaction which slow things down considerably.
>>
>> When allocating pool pages, note the state of the previous allocation
>> for each node.  If previous allocation failed, do not use the
>> aggressive retry algorithm on successive attempts.  The allocation
>> will still succeed if there is memory available, but it will not try
>> as hard to free up memory.
>>
>> Signed-off-by: Mike Kravetz <mike.kravetz@...cle.com>
> 
> Looks like only part of the (agreed with) suggestions were implemented?

My bad, I pulled in the wrong patch.

> - set_max_huge_pages() returns -ENOMEM if nodemask can't be allocated,
> but hugetlb_hstate_alloc_pages() doesn't.

That is somewhat intentional.  The two routines are called in quite
different contexts.  hugetlb_hstate_alloc_pages() runs at boot time to handle
command line parameters, and it is declared void, so it has no way to return
an error to its caller.

We could print a warning here, but if we can't allocate a node mask I am
pretty sure we will not be able to boot anyway.  I will add a comment.
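
For readers following along, here is a rough userspace sketch of the idea
in the patch description: remember per node whether the last allocation
failed, skip the expensive retry path on that node afterwards, and carry on
without the tracking if the mask itself could not be allocated.  This is not
the patch.  Every name in it (noretry_mask, alloc_on_node, alloc_pool_page,
fill_pool, run_demo) is made up for illustration, the "memory" is just a
table of counters, and a real nodemask is a bitmap rather than an array of
flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define NR_NODES 4	/* hypothetical node count for this sketch */

/* Hypothetical stand-in for a nodemask: one "do not retry hard" flag per node. */
struct noretry_mask {
	bool node[NR_NODES];
};

/* Simulated free pages per node; node 1 is smaller and fills up first. */
static int free_pages[NR_NODES];
/* Counts attempts that would have gone through the costly retry path. */
static int aggressive_attempts;

/* Simulated page allocation on one node. */
static bool alloc_on_node(int nid, bool aggressive)
{
	assert(nid >= 0 && nid < NR_NODES);
	if (aggressive)
		aggressive_attempts++;
	if (free_pages[nid] == 0)
		return false;
	free_pages[nid]--;
	return true;
}

/*
 * Allocate one pool page from node nid.  A NULL mask is tolerated: in that
 * case every attempt simply uses the aggressive path, as before.
 */
static bool alloc_pool_page(int nid, struct noretry_mask *mask)
{
	bool aggressive = !(mask && mask->node[nid]);
	bool ok = alloc_on_node(nid, aggressive);

	if (mask)
		mask->node[nid] = !ok;	/* remember failure, forget it on success */
	return ok;
}

/* Interleave allocations across nodes until want pages are obtained or all nodes fail. */
static int fill_pool(int want, struct noretry_mask *mask)
{
	int got = 0, failed_in_a_row = 0, nid = 0;

	while (got < want && failed_in_a_row < NR_NODES) {
		if (alloc_pool_page(nid, mask)) {
			got++;
			failed_in_a_row = 0;
		} else {
			failed_in_a_row++;
		}
		nid = (nid + 1) % NR_NODES;
	}
	return got;
}

/* Reset the simulation, fill a 10 page pool, return the aggressive attempt count. */
static int run_demo(bool with_mask)
{
	static const int initial[NR_NODES] = { 3, 1, 3, 3 };
	struct noretry_mask *mask = NULL;

	memcpy(free_pages, initial, sizeof(free_pages));
	aggressive_attempts = 0;
	if (with_mask)
		mask = calloc(1, sizeof(*mask));	/* may be NULL, tolerated above */
	fill_pool(10, mask);
	free(mask);	/* free(NULL) is a no-op, so no NULL check is needed */
	return aggressive_attempts;
}
```

In this toy setup the pool fills completely either way.  The only difference
is that the second attempt on the full node takes the cheap path when the
mask is present, which is the behavior the changelog is after.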

> - there's still __GFP_NORETRY in nodemask allocations
> - (cosmetics) Mel pointed out that NODEMASK_FREE() works fine with NULL
> pointers

-- 
Mike Kravetz