Message-ID: <9872cec9-a0fe-cfe0-0df6-90b6dd909f04@oracle.com>
Date: Wed, 14 Aug 2019 09:46:47 -0700
From: Mike Kravetz <mike.kravetz@...cle.com>
To: Mina Almasry <almasrymina@...gle.com>
Cc: shuah@...nel.org, rientjes@...gle.com, shakeelb@...gle.com,
gthelen@...gle.com, akpm@...ux-foundation.org,
khalid.aziz@...cle.com, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-kselftest@...r.kernel.org
Subject: Re: [RFC PATCH v2 4/5] hugetlb_cgroup: Add accounting for shared mappings
On 8/13/19 4:54 PM, Mike Kravetz wrote:
> On 8/8/19 4:13 PM, Mina Almasry wrote:
>> For shared mappings, the pointer to the hugetlb_cgroup to uncharge lives
>> in the resv_map entries, in file_region->reservation_counter.
>>
>> When a file_region entry is added to the resv_map via region_add, we
>> also charge the appropriate hugetlb_cgroup and put the pointer to that
>> in file_region->reservation_counter. This is slightly delicate since we
>> need to not modify the resv_map until we know that charging the
>> reservation has succeeded. If charging doesn't succeed, we report the
>> error to the caller, so that the kernel fails the reservation.
>
> I wish we did not need to modify these region_() routines as they are
> already difficult to understand. However, I see no other way with the
> desired semantics.
>
I suspect you have considered this, but what about using the return value
from region_chg() in hugetlb_reserve_pages() to charge reservation limits?
There is a VERY SMALL race where the value could be too large, but that
can be checked and adjusted at region_add time as is done with normal
accounting today. If the question is where we would store the information
to uncharge, then we can hang a structure off the vma. This would be
similar to what is done for private mappings. In fact, I would suggest
making them both use a new cgroup reserve structure hanging off the vma.
One issue I see is what to do if a vma is split. The private mapping case
'should' handle this today, but I would not be surprised if such code is
missing or incorrect.
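
To make the charge-then-adjust idea concrete, here is a rough user-space
sketch. It is only a model under stated assumptions: struct resv_counter,
resv_try_charge(), resv_uncharge() and reserve_shared() are made-up names,
not existing kernel interfaces, and the chg and add arguments merely stand
in for the values returned by region_chg() and region_add().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a per-cgroup reservation counter. */
struct resv_counter {
	long usage;	/* pages currently charged */
	long limit;	/* maximum pages that may be charged */
};

/* Charge 'pages' against the counter, or fail without side effects. */
static int resv_try_charge(struct resv_counter *c, long pages)
{
	if (c->usage + pages > c->limit)
		return -ENOMEM;
	c->usage += pages;
	return 0;
}

/* Give back 'pages' that were charged earlier. */
static void resv_uncharge(struct resv_counter *c, long pages)
{
	assert(pages >= 0 && pages <= c->usage);
	c->usage -= pages;
}

/*
 * chg: estimate of pages needed (models the region_chg() return value)
 * add: pages actually added    (models the region_add() return value)
 *
 * Charge the estimate first. If that fails, return the error before
 * anything else is modified. If the estimate turned out too large,
 * uncharge the difference afterwards.
 *
 * Returns the number of pages left charged, or a negative errno.
 */
static long reserve_shared(struct resv_counter *c, long chg, long add)
{
	long charged = chg;
	int ret;

	ret = resv_try_charge(c, chg);
	if (ret)
		return ret;

	if (chg > add) {
		/* Raced: fewer pages were needed than estimated. */
		resv_uncharge(c, chg - add);
		charged = add;
	}
	return charged;
}
```

With a limit of 10 pages, reserve_shared(&c, 5, 3) would charge 5 and then
hand 2 back, leaving 3 charged. The sketch does not say where the
information needed to uncharge later would live; that is the part which
would hang off the vma.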
--
Mike Kravetz