lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 2 Oct 2023 10:50:26 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Nhat Pham <nphamcs@...il.com>, akpm@...ux-foundation.org,
        riel@...riel.com, roman.gushchin@...ux.dev, shakeelb@...gle.com,
        muchun.song@...ux.dev, tj@...nel.org, lizefan.x@...edance.com,
        shuah@...nel.org, mike.kravetz@...cle.com, yosryahmed@...gle.com,
        linux-mm@...ck.org, kernel-team@...a.com,
        linux-kernel@...r.kernel.org, cgroups@...r.kernel.org
Subject: Re: [PATCH v2 1/2] hugetlb: memcg: account hugetlb-backed memory in
 memory controller

On Mon, Oct 02, 2023 at 03:43:19PM +0200, Michal Hocko wrote:
> On Wed 27-09-23 17:57:22, Nhat Pham wrote:
> > Currently, hugetlb memory usage is not acounted for in the memory
> > controller, which could lead to memory overprotection for cgroups with
> > hugetlb-backed memory. This has been observed in our production system.
> > 
> > This patch rectifies this issue by charging the memcg when the hugetlb
> > folio is allocated, and uncharging when the folio is freed (analogous to
> > the hugetlb controller).
> 
> This changelog is missing a lot of information. Both about the usecase
> (we do not want to fish that out from archives in the future) and the
> actual implementation and the reasoning behind that.
> 
> AFAICS you have decided to charge on the hugetlb use rather than hugetlb
> allocation to the pool. I suspect the underlying reasoning is that pool
> pages do not belong to anybody. This is a deliberate decision and it
> should be documented as such.
> 
> It is also very important do describe subtle behavior properties that
> might be rather unintuitive to users. Most notably 
> - there is no hugetlb pool management involved in the memcg
>   controller. One has to use hugetlb controller for that purpose.
>   Also the pre allocated pool as such doesn't belong to anybody so the
>   memcg host overcommit management has to consider it when configuring
>   hard limits.

+1

> - memcg limit reclaim doesn't assist hugetlb pages allocation when
>   hugetlb overcommit is configured (i.e. pages are not consumed from the
>   pool) which means that the page allocation might disrupt workloads
>   from other memcgs.
> - failure to charge a hugetlb page results in SIGBUS rather
>   than memcg oom killer. That could be the case even if the
>   hugetlb pool still has pages available and there is
>   reclaimable memory in the memcg.

Are these actually true? AFAICS, regardless of whether the page comes
from the pool or the buddy allocator, the memcg code will go through
the regular charge path, attempt reclaim, and OOM if that fails.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ