Message-Id: <20241031183413.bb0bc34e8354cc14cdfc3c29@linux-foundation.org>
Date: Thu, 31 Oct 2024 18:34:13 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Joshua Hahn <joshua.hahnjy@...il.com>
Cc: Michal Hocko <mhocko@...e.com>, Johannes Weiner <hannes@...xchg.org>,
 nphamcs@...il.com, shakeel.butt@...ux.dev, roman.gushchin@...ux.dev,
 muchun.song@...ux.dev, tj@...nel.org, lizefan.x@...edance.com,
 mkoutny@...e.com, corbet@....net, lnyng@...a.com, cgroups@...r.kernel.org,
 linux-mm@...ck.org, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH v3 1/1] memcg/hugetlb: Adding hugeTLB counters to memcg

On Thu, 31 Oct 2024 15:03:34 -0400 Joshua Hahn <joshua.hahnjy@...il.com> wrote:

> Andrew -- I am sorry to ask again, but do you think you can replace
> the 3rd section in the patch (3. Implementation Details) with the
> following paragraphs?

No problem.

: This patch introduces a new counter to memory.stat that tracks hugeTLB
: usage, but only if hugeTLB usage is also accounted in memory.current.
: This feature is enabled the same way hugeTLB accounting is enabled, via
: the memory_hugetlb_accounting mount flag for cgroup v2.
: 
: 1. Why is this patch necessary?
: Currently, memcg hugeTLB accounting is an opt-in feature [1] that adds
: hugeTLB usage to memory.current.  However, the metric is not reported in
: memory.stat.  Given that users often interpret memory.stat as a breakdown
: of the value reported in memory.current, the disparity between the two
: reports can be confusing.  This patch solves this problem by including the
: metric in memory.stat as well, but only if it is also reported in
: memory.current (it would also be confusing if the value was reported in
: memory.stat, but not in memory.current).
: 
: Aside from the consistency between the two files, we also see benefits in
: observability.  Userspace might be interested in the hugeTLB footprint of
: cgroups for many reasons.  For instance, system admins might want to
: verify that hugeTLB usage is distributed as expected across tasks: i.e. 
: memory-intensive tasks are using more hugeTLB pages than tasks that don't
: consume a lot of memory, or are seen to fault frequently.  Note that this
: is separate from wanting to inspect the distribution for limiting purposes
: (in which case, the hugeTLB controller makes more sense).
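
The observability use case above can be sketched from userspace. This is a minimal sketch, assuming the new counter appears in memory.stat under the key `hugetlb` (the changelog does not state the name, so that is an assumption); the helper names and sample values are hypothetical.

```python
from typing import Dict, Optional

# Assumed key name for the new counter; this changelog does not state it.
HUGETLB_KEY = "hugetlb"


def parse_memory_stat(text: str) -> Dict[str, int]:
    """Parse flat 'name value' lines, the format used by memory.stat."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, value = fields
        stats[name] = int(value)
    return stats


def hugetlb_bytes(stat_text: str) -> Optional[int]:
    """Return the hugeTLB counter, or None if memory.stat does not report it."""
    return parse_memory_stat(stat_text).get(HUGETLB_KEY)


# Made-up sample contents for illustration, not real kernel output.
SAMPLE_WITH = "anon 4096\nfile 8192\nhugetlb 2097152\n"
SAMPLE_WITHOUT = "anon 4096\nfile 8192\n"

if __name__ == "__main__":
    print(hugetlb_bytes(SAMPLE_WITH))
    print(hugetlb_bytes(SAMPLE_WITHOUT))
```

On a real system the text would come from a cgroup's memory.stat file, typically under /sys/fs/cgroup. A missing key would suggest that hugeTLB accounting is not enabled for memcg.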
: 
: 2. We already have a hugeTLB controller.  Why not use that?
: It is true that the hugeTLB controller tracks the exact value that we
: want.  In fact, by enabling the hugeTLB controller, we get all of the
: observability benefits that I mentioned above, and users can check the
: total hugeTLB usage, verify if it is distributed as expected, etc.
: 
: With this said, there are 2 problems:
: (a) HugeTLB usage is still not reported in memory.stat, which means
:     the disparity between the memcg reports is still there.
: (b) We cannot reasonably expect users to enable the hugeTLB controller
:     just for the sake of hugeTLB usage reporting, especially since
:     they don't have any use for hugeTLB usage enforcing [2].
: 
: 3. Implementation Details:
: In the alloc / free hugetlb functions, we call lruvec_stat_mod_folio
: regardless of whether memcg accounts hugetlb.  mem_cgroup_commit_charge,
: which is called from alloc_hugetlb_folio, will set memcg for the folio
: only if the CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING cgroup mount option is
: used, so lruvec_stat_mod_folio accounts per-memcg hugetlb counters only
: if the feature is enabled.  Regardless of whether memcg accounts for
: hugetlb, the newly added global counter is updated and shown in
: /proc/vmstat.
: 
: The global counter is added because vmstats is the preferred framework
: for cgroup stats.  It makes stat items consistent between global and
: cgroups.  It also provides a per-node breakdown, which is useful.
: Because it does not use cgroup-specific hooks, we also keep generic MM
: code separate from memcg code.
: 
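
The conditional accounting described under "Implementation Details" can be modeled in a few lines. This is a toy Python model, not kernel code: all class and function names are hypothetical, and it only represents the behavior stated in the changelog (the global counter is always updated; the per-memcg counter is updated only when the folio has a memcg set, which happens only with the accounting mount option).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Folio:
    nr_pages: int
    memcg: Optional[str] = None  # stays None unless the charge step sets it


@dataclass
class Counters:
    global_hugetlb: int = 0  # stands in for the global /proc/vmstat item
    per_memcg: Dict[str, int] = field(default_factory=dict)

    def stat_mod_folio(self, folio: Folio, delta: int) -> None:
        """Always update the global counter; per-memcg only if memcg is set."""
        self.global_hugetlb += delta
        if folio.memcg is not None:
            current = self.per_memcg.get(folio.memcg, 0)
            self.per_memcg[folio.memcg] = current + delta


def alloc_hugetlb(counters: Counters, cgroup: str, nr_pages: int,
                  accounting_enabled: bool) -> Folio:
    """Model of allocation: the charge step sets memcg only if enabled."""
    folio = Folio(nr_pages)
    if accounting_enabled:
        folio.memcg = cgroup
    counters.stat_mod_folio(folio, nr_pages)
    return folio


def free_hugetlb(counters: Counters, folio: Folio) -> None:
    """Model of free: the same stat update with a negative delta."""
    counters.stat_mod_folio(folio, -folio.nr_pages)
```

In this model, an allocation made without accounting enabled changes only the global count, which mirrors the changelog's point that the stat call can be made unconditionally from generic MM code.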
: [1] https://lore.kernel.org/all/20231006184629.155543-1-nphamcs@gmail.com/
: [2] Of course, we can't make a new patch for every feature that can be
:     duplicated. However, since the existing solution of enabling the
:     hugeTLB controller is an imperfect solution that still leaves a
:     discrepancy between memory.stat and memory.current, I think that it
:     is reasonable to isolate the feature in this case.

