linux-kernel - Re: [RFC PATCH] mm,memory_hotplug: Unlock 1GB-hugetlb on x86_64


Message-ID: <bb71b68e-dc1b-a4d3-d842-b311535b92a8@redhat.com>
Date:   Thu, 28 Feb 2019 08:38:34 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Mike Kravetz <mike.kravetz@...cle.com>,
        Oscar Salvador <osalvador@...e.de>, linux-mm@...ck.org
Cc:     linux-kernel@...r.kernel.org, mhocko@...e.com
Subject: Re: [RFC PATCH] mm,memory_hotplug: Unlock 1GB-hugetlb on x86_64

On 27.02.19 23:00, Mike Kravetz wrote:
> On 2/27/19 1:51 PM, Oscar Salvador wrote:
>> On Thu, Feb 21, 2019 at 10:42:12AM +0100, Oscar Salvador wrote:
>>> [1] https://lore.kernel.org/patchwork/patch/998796/
>>>
>>> Signed-off-by: Oscar Salvador <osalvador@...e.de>
>>
>> Any further comments on this?
>> I do have a "concern" I would like to sort out before dropping the RFC:
>>
>> It is the fact that unless we have spare gigantic pages in other nodes, the
>> offlining operation will loop forever (until the customer cancels the operation).
>> While I do not really like that, I do think that memory offlining should be done
>> with some sanity, and the administrator should know in advance if the system is going
>> to be able to keep up with the memory pressure, aka: make sure we have what we need in
>> order for the offlining operation to succeed.
>> That translates to making sure that we have spare gigantic pages and that other nodes
>> can take them.
>>
>> That said, another thing I thought about is that we could check if we have
>> spare gigantic pages at has_unmovable_pages() time.
>> Something like checking "h->free_huge_pages - h->resv_huge_pages > 0", and if it
>> turns out that we do not have gigantic pages anywhere, just return as we have
>> non-movable pages.
> 
> Of course, that check would be racy.  Even if there is an available gigantic
> page at has_unmovable_pages() time there is no guarantee it will be there when
> we want to allocate/use it.  But, you would at least catch 'most' cases of
> looping forever.
> 
>> But I would rather not convolute has_unmovable_pages() with such checks and "trust"
>> the administrator.
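
For illustration only, here is a minimal userspace sketch of the spare-page
check quoted above. The struct and function names are hypothetical stand-ins
and not kernel code; only the two counter names come from the thread. The
counters are assumed to be unsigned, so the sketch compares them instead of
subtracting, since "free - resv > 0" would wrap around if resv ever exceeded
free.

```c
#include <stdbool.h>

/* Hypothetical stand-in holding the two counters named in the thread. */
struct hstate_sketch {
	unsigned long free_huge_pages;	/* gigantic pages currently free */
	unsigned long resv_huge_pages;	/* free pages already promised to users */
};

/*
 * Returns true if at least one free gigantic page is not reserved.
 * This is only a snapshot: nothing stops another task from taking the
 * page between this check and the later allocation, so it is racy.
 */
static bool has_spare_gigantic_page(const struct hstate_sketch *h)
{
	/* Compare instead of subtracting to avoid unsigned wrap-around. */
	return h->free_huge_pages > h->resv_huge_pages;
}
```

As noted in the thread, a true result here gives no guarantee for the later
allocation; it would only filter out the cases where no spare page exists at
all.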

I think we have the exact same issue already with huge/ordinary pages if
we are low on memory. We could loop forever.

In the long run, we should properly detect such issues and abort instead
of looping forever I guess. But as we all know, error handling in the
whole offlining part is still far away from being perfect ...
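
To make "abort instead of looping forever" concrete, here is a rough
userspace sketch of a bounded retry loop. It is not how the offlining code
works today; the function names, the callback type, and the choice of -EBUSY
as the error are all assumptions made for the example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical callback: runs one migration pass, true once the range is empty. */
typedef bool (*migrate_pass_fn)(void *ctx);

/*
 * Try at most max_passes migration passes. Returns 0 on success and
 * -EBUSY (an assumed error code) when the limit is reached, instead of
 * retrying until the operation is cancelled from outside.
 */
static int offline_with_retry_limit(migrate_pass_fn pass, void *ctx,
				    unsigned int max_passes)
{
	for (unsigned int i = 0; i < max_passes; i++) {
		if (pass(ctx))
			return 0;
	}
	return -EBUSY;
}

/* Example callback: succeeds once the counter behind ctx reaches zero. */
static bool countdown_pass(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0)
		(*remaining)--;
	return *remaining == 0;
}
```

A fixed pass count is the crudest possible policy; detecting that no progress
was made between two passes would likely be a better abort condition, but it
needs more state than this sketch carries.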

-- 

Thanks,

David / dhildenb