Message-ID: <ZRwQEv62Ex4+H2CZ@dhcp22.suse.cz>
Date:   Tue, 3 Oct 2023 14:58:58 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Nhat Pham <nphamcs@...il.com>
Cc:     akpm@...ux-foundation.org, riel@...riel.com, hannes@...xchg.org,
        roman.gushchin@...ux.dev, shakeelb@...gle.com,
        muchun.song@...ux.dev, tj@...nel.org, lizefan.x@...edance.com,
        shuah@...nel.org, mike.kravetz@...cle.com, yosryahmed@...gle.com,
        fvdl@...gle.com, linux-mm@...ck.org, kernel-team@...a.com,
        linux-kernel@...r.kernel.org, cgroups@...r.kernel.org
Subject: Re: [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in
 memory controller

On Mon 02-10-23 17:18:27, Nhat Pham wrote:
> Currently, hugetlb memory usage is not accounted for in the memory
> controller, which could lead to memory overprotection for cgroups with
> hugetlb-backed memory. This has been observed in our production system.
> 
> For instance, here is one of our use cases: suppose there are two 32G
> containers. The machine is booted with hugetlb_cma=6G, and each
> container may or may not use up to 3 gigantic pages, depending on the
> workload within it. The rest is anon, cache, slab, etc. We can set the
> hugetlb cgroup limit of each cgroup to 3G to enforce hugetlb fairness.
> But it is very difficult to configure memory.max to keep overall
> consumption, including anon, cache, slab etc. fair.
> 
> What we have had to resort to is to constantly poll hugetlb usage and
> readjust memory.max. A similar procedure is applied to other memory
> limits (e.g. memory.low). However, this is rather cumbersome and buggy.

Could you expand some more on how this _helps_ memory.low? The
hugetlb memory is not reclaimable, so whatever portion of a memcg's
consumption it makes up is effectively "protected from reclaim". Consider this:
            parent
           /      \
          A        B
      low=50%      low=0
  current=40%      current=60%

Suppose there is external memory pressure. Reclaim should prefer B, as A
is under its low limit, correct? But now consider that the predominant
consumption of B is hugetlb, which means memory reclaim cannot
do much for B, and so A's protection might be breached.
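
The scenario above can be sketched with a toy model. This is for
illustration only and is not the kernel's reclaim algorithm: the Cgroup
class, the reclaim() helper, the two-pass policy, and the 55% hugetlb
figure are all assumptions made up for this example. Units are percent
of the parent's memory.

```python
from dataclasses import dataclass


@dataclass
class Cgroup:
    name: str
    current: int        # total charged usage
    low: int            # memory.low protection
    unreclaimable: int  # part of `current` that is hugetlb-backed


def reclaim(groups, target):
    """Toy two-pass reclaim: unprotected memory first, then protected."""
    taken = {g.name: 0 for g in groups}
    # Pass 1: only usage above memory.low, and only what is reclaimable.
    for g in groups:
        reclaimable = g.current - g.unreclaimable
        excess = max(g.current - g.low, 0)
        amount = min(target, reclaimable, excess)
        taken[g.name] += amount
        target -= amount
    # Pass 2: pressure remains, so protection is breached.
    for g in groups:
        reclaimable = g.current - g.unreclaimable - taken[g.name]
        amount = min(target, reclaimable)
        taken[g.name] += amount
        target -= amount
    return taken


# B fully reclaimable: A's protection holds.
no_hugetlb = reclaim([Cgroup("A", 40, 50, 0), Cgroup("B", 60, 0, 0)], 20)
# B mostly hugetlb (assumed 55 of its 60): A gets reclaimed instead.
with_hugetlb = reclaim([Cgroup("A", 40, 50, 0), Cgroup("B", 60, 0, 55)], 20)
print(no_hugetlb, with_hugetlb)
```

In the second case only 5 units can come from B, so the remaining 15
are taken from A even though A is below its low limit.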

As an admin (or a tool) you need to know about hugetlb as a potential
contributor to this behavior (sure, mlocked memory would behave the same,
but mlock rarely consumes a huge amount of memory in my experience).
Without the accounting there might not be any external pressure in the
first place.

All that being said, I do not see how adding hugetlb to the accounting
makes managing the low and min limits any easier.

-- 
Michal Hocko
SUSE Labs
