Message-ID: <jllnyxah2czkca4bpbaqshksdjqk7lapgviee6gyajlqx3pcon@qwrf5ooxzrim>
Date: Sat, 18 May 2024 00:32:53 -0700
From: Shakeel Butt <shakeel.butt@...ux.dev>
To: Yafang Shao <laoar.shao@...il.com>
Cc: Yosry Ahmed <yosryahmed@...gle.com>, 
	Roman Gushchin <roman.gushchin@...ux.dev>, Andrew Morton <akpm@...ux-foundation.org>, 
	Muchun Song <muchun.song@...ux.dev>, Johannes Weiner <hannes@...xchg.org>, 
	Michal Hocko <mhocko@...nel.org>, Matthew Wilcox <willy@...radead.org>, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org, gthelen@...gle.com, rientjes@...gle.com, 
	Chris Li <chrisl@...nel.org>, Ivan Babrou <ivan@...udflare.com>
Subject: Re: [PATCH rfc 0/9] mm: memcg: separate legacy cgroup v1 code and
 put under config option

On Thu, May 16, 2024 at 11:35:57AM +0800, Yafang Shao wrote:
> On Thu, May 9, 2024 at 2:33 PM Shakeel Butt <shakeel.butt@...ux.dev> wrote:
> >
> 
[...]
> Hi Shakeel,
> 
> Hopefully I'm not too late.  We are currently using memcg v1.
> 
> One specific feature we rely on in v1 is skmem accounting. In v1, we
> account for TCP memory usage without charging it to memcg v1, which is
> useful for monitoring the TCP memory usage generated by tasks running
> in a container. However, in memcg v2, monitoring TCP memory requires
> charging it to the container, which can easily cause OOM issues. It
> would be better if we could monitor skmem usage without charging it in
> the memcg v2, allowing us to account for it without the risk of
> triggering OOM conditions.
> 

Hi Yafang,

No worries. From what I understand, you are not really using the skmem
charging of v1, just the network memory usage stats, and you are
worried that charging network memory to cgroup memory may cause OOMs. Is
that correct? Have you tried charging network memory to cgroup memory
before and seen OOMs? If so, I would really like to see the OOM reports.
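
For reference, on the stats side: a minimal sketch of reading the socket
memory that v2 reports for a cgroup. It assumes the cgroup v2 hierarchy
is mounted at /sys/fs/cgroup and relies on the "sock" entry of
memory.stat (memory used in network transmission buffers). The helper
names and the example path are hypothetical. Note that under v2 this
memory is also charged to the cgroup, so reading the stat does not avoid
the charging concern raised above.

```python
from pathlib import Path
from typing import Dict, Optional


def parse_memory_stat(text: str) -> Dict[str, int]:
    """Parse the flat "key value" lines of a cgroup v2 memory.stat file."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not a simple "<name> <integer>" pair.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def read_sock_bytes(cgroup_dir: str) -> Optional[int]:
    """Return the "sock" value in bytes, or None if the key is absent.

    cgroup_dir is an assumed path such as /sys/fs/cgroup/<container>.
    """
    stat_path = Path(cgroup_dir) / "memory.stat"
    return parse_memory_stat(stat_path.read_text()).get("sock")
```

Something like read_sock_bytes("/sys/fs/cgroup/mycontainer") could then
be polled by a monitoring agent; "mycontainer" is a placeholder.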

I have two examples where v2's skmem charging is working fine in
production, namely Google and Meta. Google is still on v1, but for skmem
charging they have moved to v2 semantics. In fact, I have another
report, from Cloudflare [0], where the TCP throttling mechanism for v2's
TCP memory accounting is too conservative for their production
traffic.

Anyway, this just means that we need a more flexible way to provide
and enforce semantics for TCP memory pressure, with a decent default
behavior. I will follow up on this separately.

[0] https://lore.kernel.org/lkml/CABWYdi0G7cyNFbndM-ELTDAR3x4Ngm0AehEp5aP0tfNkXUE+Uw@mail.gmail.com/

thanks,
Shakeel
