[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZyFFY65bGILq6GfQ@google.com>
Date: Tue, 29 Oct 2024 20:28:19 +0000
From: Roman Gushchin <roman.gushchin@...ux.dev>
To: Joshua Hahn <joshua.hahnjy@...il.com>
Cc: hannes@...xchg.org, nphamcs@...il.com, shakeel.butt@...ux.dev,
mhocko@...nel.org, muchun.song@...ux.dev, tj@...nel.org,
lizefan.x@...edance.com, mkoutny@...e.com, corbet@....net,
lnyng@...a.com, akpm@...ux-foundation.org, cgroups@...r.kernel.org,
linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH v3 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg
On Mon, Oct 28, 2024 at 02:05:05PM -0700, Joshua Hahn wrote:
> This patch introduces a new counter to memory.stat that tracks hugeTLB
> usage, only if hugeTLB accounting is done to memory.current. This
> feature is enabled the same way hugeTLB accounting is enabled, via
> the memory_hugetlb_accounting mount flag for cgroupsv2.
>
> 1. Why is this patch necessary?
> Currently, memcg hugeTLB accounting is an opt-in feature [1] that adds
> hugeTLB usage to memory.current. However, the metric is not reported in
> memory.stat. Given that users often interpret memory.stat as a breakdown
> of the value reported in memory.current, the disparity between the two
> reports can be confusing. This patch solves this problem by including
> the metric in memory.stat as well, but only if it is also reported in
> memory.current (it would also be confusing if the value was reported in
> memory.stat, but not in memory.current)
>
> Aside from the consistency between the two files, we also see benefits
> in observability. Userspace might be interested in the hugeTLB footprint
> of cgroups for many reasons. For instance, system admins might want to
> verify that hugeTLB usage is distributed as expected across tasks: i.e.
> memory-intensive tasks are using more hugeTLB pages than tasks that
> don't consume a lot of memory, or are seen to fault frequently. Note that
> this is separate from wanting to inspect the distribution for limiting
> purposes (in which case, hugeTLB controller makes more sense).
>
> 2. We already have a hugeTLB controller. Why not use that?
> It is true that hugeTLB tracks the exact value that we want. In fact, by
> enabling the hugeTLB controller, we get all of the observability
> benefits that I mentioned above, and users can check the total hugeTLB
> usage, verify if it is distributed as expected, etc.
>
> With this said, there are 2 problems:
> (a) They are still not reported in memory.stat, which means the
> disparity between the memcg reports are still there.
> (b) We cannot reasonably expect users to enable the hugeTLB controller
> just for the sake of hugeTLB usage reporting, especially since
> they don't have any use for hugeTLB usage enforcing [2].
>
> [1] https://lore.kernel.org/all/20231006184629.155543-1-nphamcs@gmail.com/
> [2] Of course, we can't make a new patch for every feature that can be
> duplicated. However, since the existing solution of enabling the
> hugeTLB controller is an imperfect solution that still leaves a
> discrepancy between memory.stat and memory.curent, I think that it
> is reasonable to isolate the feature in this case.
>
> Suggested-by: Nhat Pham <nphamcs@...il.com>
> Suggested-by: Shakeel Butt <shakeel.butt@...ux.dev>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
Reviewed-by: Roman Gushchin <roman.gushchin@...ux.dev>
Thanks!
Powered by blists - more mailing lists