[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHS8izOv3GjKhnzVmksfH0U9xZ6OnC0R-XEZsqVxOvrJ5u_BBw@mail.gmail.com>
Date: Thu, 15 Aug 2019 16:04:30 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Mike Kravetz <mike.kravetz@...cle.com>
Cc: shuah <shuah@...nel.org>, David Rientjes <rientjes@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
Greg Thelen <gthelen@...gle.com>, akpm@...ux-foundation.org,
khalid.aziz@...cle.com, open list <linux-kernel@...r.kernel.org>,
linux-mm@...ck.org, linux-kselftest@...r.kernel.org
Subject: Re: [RFC PATCH v2 4/5] hugetlb_cgroup: Add accounting for shared mappings
On Wed, Aug 14, 2019 at 9:46 AM Mike Kravetz <mike.kravetz@...cle.com> wrote:
>
> On 8/13/19 4:54 PM, Mike Kravetz wrote:
> > On 8/8/19 4:13 PM, Mina Almasry wrote:
> >> For shared mappings, the pointer to the hugetlb_cgroup to uncharge lives
> >> in the resv_map entries, in file_region->reservation_counter.
> >>
> >> When a file_region entry is added to the resv_map via region_add, we
> >> also charge the appropriate hugetlb_cgroup and put the pointer to that
> >> in file_region->reservation_counter. This is slightly delicate since we
> >> need to not modify the resv_map until we know that charging the
> >> reservation has succeeded. If charging doesn't succeed, we report the
> >> error to the caller, so that the kernel fails the reservation.
> >
> > I wish we did not need to modify these region_() routines as they are
> > already difficult to understand. However, I see no other way with the
> > desired semantics.
> >
>
> I suspect you have considered this, but what about using the return value
> from region_chg() in hugetlb_reserve_pages() to charge reservation limits?
> There is a VERY SMALL race where the value could be too large, but that
> can be checked and adjusted at region_add time as is done with normal
> accounting today.
I have not actually until now; I didn't consider doing stuff with the
resv_map while not holding onto the resv_map->lock. I guess that's the
small race you're talking about. Seems fine to me, but I'm more
worried about hanging off the vma below.
> If the question is, where would we store the information
> to uncharge?, then we can hang a structure off the vma. This would be
> similar to what is done for private mappings. In fact, I would suggest
> making them both use a new cgroup reserve structure hanging off the vma.
>
I actually did consider hanging off the info to uncharge off the vma,
but I didn't for a couple of reasons:
1. region_del is called from hugetlb_unreserve_pages, and I don't have
access to the vma there. Maybe there is a way to query the proper vma
I don't know about?
2. hugetlb_reserve_pages seems to be able to conduct a reservation
with a NULL *vma. Not sure what to do in that case.
Is there a way to get around these that I'm missing here?
FWIW I think tracking is better in resv_map since the reservations are
in resv_map themselves. If I do another structure, then for each
reservation there will be an entry in resv_map and an entry in the new
structure and they need to be kept in sync and I need to handle errors
for when they get out of sync.
> One issue I see is what to do if a vma is split? The private mapping case
> 'should' handle this today, but I would not be surprised if such code is
> missing or incorrect.
>
> --
> Mike Kravetz
Powered by blists - more mailing lists