[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <diqz7c78rob7.fsf@ackerleytng-ctop.c.googlers.com>
Date: Mon, 06 Jan 2025 14:48:12 +0000
From: Ackerley Tng <ackerleytng@...gle.com>
To: Peter Xu <peterx@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, riel@...riel.com,
leitao@...ian.org, akpm@...ux-foundation.org, muchun.song@...ux.dev,
osalvador@...e.de, roman.gushchin@...ux.dev, nao.horiguchi@...il.com
Subject: Re: [PATCH 4/7] mm/hugetlb: Clean up map/global resv accounting when allocate
Peter Xu <peterx@...hat.com> writes:
> On Sat, Dec 28, 2024 at 12:06:34AM +0000, Ackerley Tng wrote:
>> >
>> > - /* If this allocation is not consuming a reservation, charge it now.
>> > + /*
>> > + * If this allocation is not consuming a per-vma reservation,
>> > + * charge the hugetlb cgroup now.
>> > */
>> > - deferred_reserve = map_chg || cow_from_owner;
>> > - if (deferred_reserve) {
>> > + if (map_chg) {
>> > ret = hugetlb_cgroup_charge_cgroup_rsvd(
>> > idx, pages_per_huge_page(h), &h_cg);
>>
>> Should hugetlb_cgroup_charge_cgroup_rsvd() be called when map_chg == MAP_CHG_ENFORCED?
>
> This looks like a pretty niche use case, though I would say yes.
>
> I don't think I take a lot of consideration here when drafting the patch,
> as the change here should have kept the old behavior: map_chg grows into
> the tristate so that we can drop deferred_reserve, OTOH nothing should
> change from such behavior of cgroup charging.
>
> When it happens, it means the owner process CoWed a private hugetlb folio
> which will enforce bypassing the vma reservation. Here bypassing the vma
> check makes sense to me, because the new to-be-cowed folio X will replace
> another folio Y, which should have consumed the private vma resv at this
> specific index. So there's no way the to-be-cowed folio X can have anything
> to do with the vma reservation..
>
> Besides the vma reservation, I don't see why this folio allocation needs to
> be any more special. IOW, it should still go through all rest checks and
> fail the process properly if the check fails, that should include any form
> of cgroups (either hugetlb or memcg), IMHO.
>
> Do you have any specific thought on this path?
I re-read the code, and I hope this understanding is right:
When a user sets "rsvd.max_usage_in_bytes" to X, the user is saying that
within this cgroup, the maximum memory that can be reserved in the vma
reservation is X.
Hence even when this CoW is performed, this should count towards the
cgroup's "rsvd.max_usage_in_bytes" and so yes, it should be charged.
I think I misunderstood the context on cgroup charging earlier and hence
I thought it shouldn't be charged, but I agree with you after
re-reading.
>
> Thanks,
Powered by blists - more mailing lists