Message-ID: <af6a8c43-e286-5360-61f3-6d306d8f1951@oracle.com>
Date: Fri, 9 Aug 2019 14:00:21 -0700
From: Mike Kravetz <mike.kravetz@...cle.com>
To: Mina Almasry <almasrymina@...gle.com>
Cc: Michal Koutný <mkoutny@...e.com>,
shuah <shuah@...nel.org>, David Rientjes <rientjes@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
Greg Thelen <gthelen@...gle.com>, akpm@...ux-foundation.org,
khalid.aziz@...cle.com, open list <linux-kernel@...r.kernel.org>,
linux-mm@...ck.org, linux-kselftest@...r.kernel.org,
cgroups@...r.kernel.org
Subject: Re: [RFC PATCH] hugetlbfs: Add hugetlb_cgroup reservation limits

On 8/9/19 1:57 PM, Mina Almasry wrote:
> On Fri, Aug 9, 2019 at 1:39 PM Mike Kravetz <mike.kravetz@...cle.com> wrote:
>>
>> On 8/9/19 11:05 AM, Mina Almasry wrote:
>>> On Fri, Aug 9, 2019 at 4:27 AM Michal Koutný <mkoutny@...e.com> wrote:
>>>>> Alternatives considered:
>>>>> [...]
>>>> (I did not try that but) have you considered:
>>>> 3) MAP_POPULATE while you're making the reservation,
>>>
>>> I have tried this, and the behaviour is not great. Basically if
>>> userspace mmaps more memory than its cgroup limit allows with
>>> MAP_POPULATE, the kernel will reserve the total amount requested by
>>> the userspace, it will fault in up to the cgroup limit, and then it
>>> will SIGBUS the task when it tries to access the rest of its
>>> 'reserved' memory.
>>>
>>> So for example:
>>> - if /proc/sys/vm/nr_hugepages == 10, and
>>> - your cgroup limit is 5 pages, and
>>> - you mmap(MAP_POPULATE) 7 pages.
>>>
>>> Then the kernel will reserve 7 pages, fault in 5 of those 7 pages,
>>> and SIGBUS you when you try to access the remaining 2 pages. So the
>>> problem persists. Folks would still like to know at mmap time that
>>> they are crossing the limits.
>>
>> If you got the failure at mmap time in the MAP_POPULATE case, would
>> that be useful?
>>
>> Just thinking that would be a relatively simple change.
>
> Not quite, unfortunately. A subset of the folks who want to use
> hugetlb memory don't want to use MAP_POPULATE. IIRC, they mmap a huge
> amount of hugetlb memory at their jobs' startup, and doing that with
> MAP_POPULATE adds so much to their startup time that it is
> prohibitively expensive. That's just what I vaguely recall offhand; I
> can get you the details if you're interested.

Yes, MAP_POPULATE can get expensive, as the kernel has to zero all those
huge pages up front, at mmap time.
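
For reference, the accounting in the quoted example (nr_hugepages == 10,
cgroup limit of 5 pages, mmap(MAP_POPULATE) of 7 pages) can be sketched
as a small model. This is only an illustration, not kernel code: the
struct and function names are hypothetical, and it assumes the whole
pool is free and the cgroup has no prior hugetlb usage.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model (not kernel code): outcome of a single
 * mmap(MAP_HUGETLB | MAP_POPULATE) call, counted in huge pages. */
struct populate_outcome {
    bool   mmap_ok;   /* false if the reservation itself cannot be made */
    size_t reserved;  /* pages reserved at mmap time */
    size_t faulted;   /* pages actually faulted in by MAP_POPULATE */
    size_t unbacked;  /* reserved pages that SIGBUS on first access */
};

static struct populate_outcome
model_populate(size_t pool_free, size_t cgroup_limit, size_t requested)
{
    struct populate_outcome out = { false, 0, 0, 0 };

    /* The reservation is checked against the global pool only;
     * the cgroup limit plays no part at this point. */
    if (requested > pool_free)
        return out;

    out.mmap_ok  = true;
    out.reserved = requested;

    /* The cgroup limit is only enforced when pages are faulted in. */
    out.faulted  = requested < cgroup_limit ? requested : cgroup_limit;
    out.unbacked = out.reserved - out.faulted;
    return out;
}
```

With model_populate(10, 5, 7), mmap succeeds with 7 pages reserved, 5
faulted in, and 2 left unbacked, which are the pages that SIGBUS on
access. The limit is crossed at mmap time but only noticed at fault
time.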
--
Mike Kravetz