Message-ID: <vsr4khfsp4unk73a75ky7i35nzdjqsbodyeeuxipu3arormfjr@awi2srdwawfu>
Date: Tue, 23 Dec 2025 15:20:47 -0800
From: Shakeel Butt <shakeel.butt@...ux.dev>
To: Yosry Ahmed <yosry.ahmed@...ux.dev>
Cc: Qi Zheng <qi.zheng@...ux.dev>, hannes@...xchg.org, hughd@...gle.com, 
	mhocko@...e.com, roman.gushchin@...ux.dev, muchun.song@...ux.dev, 
	david@...nel.org, lorenzo.stoakes@...cle.com, ziy@...dia.com, harry.yoo@...cle.com, 
	imran.f.khan@...cle.com, kamalesh.babulal@...cle.com, axelrasmussen@...gle.com, 
	yuanchu@...gle.com, weixugc@...gle.com, chenridong@...weicloud.com, mkoutny@...e.com, 
	akpm@...ux-foundation.org, hamzamahfooz@...ux.microsoft.com, apais@...ux.microsoft.com, 
	lance.yang@...ux.dev, linux-mm@...ck.org, linux-kernel@...r.kernel.org, 
	cgroups@...r.kernel.org, Qi Zheng <zhengqi.arch@...edance.com>
Subject: Re: [PATCH v2 00/28] Eliminate Dying Memory Cgroup

On Tue, Dec 23, 2025 at 08:04:50PM +0000, Yosry Ahmed wrote:
[...]
> 
> I think there might be a problem with non-hierarchical stats on cgroup
> v1, I brought it up previously [*]. I am not sure if this was addressed
> but I couldn't immediately find anything.

Sigh, the curse of memcg-v1. Let's see what we can do to not break v1.

> 
> In short, if memory is charged to a dying cgroup 

Not sure why stats updates for a dying cgroup are related. Isn't it
simply that a stat increase at the child memcg followed by a stat
decrease at the parent memcg could show a negative stat_local for the
parent?
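
The scenario in the reply above can be illustrated with a toy model. This is
only a sketch and not kernel code: struct toy_memcg, toy_mod_stat() and
toy_uncharge_after_reparent() are hypothetical names, and a single counter
pair stands in for the real per-memcg stats. It assumes a stat update
touches the local counter of the target and the hierarchical counter of the
target and each ancestor.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not kernel code: one counter pair per cgroup. */
struct toy_memcg {
	struct toy_memcg *parent;
	long stat;        /* hierarchical: this cgroup plus descendants */
	long stat_local;  /* non-hierarchical: this cgroup only */
};

/*
 * Apply a delta as a charge or uncharge would in this model: the local
 * counter changes only at the target, the hierarchical counter changes
 * at the target and at every ancestor.
 */
static void toy_mod_stat(struct toy_memcg *memcg, long delta)
{
	assert(memcg != NULL);
	memcg->stat_local += delta;
	for (; memcg; memcg = memcg->parent)
		memcg->stat += delta;
}

/*
 * Charge one page to the child, reparent the page without moving any
 * stats, then uncharge it at the parent. Returns the parent's local
 * counter.
 */
static long toy_uncharge_after_reparent(struct toy_memcg *parent,
					struct toy_memcg *child)
{
	toy_mod_stat(child, 1);    /* charge is accounted to the child */
	/* reparenting happens here: the page now belongs to the parent */
	toy_mod_stat(parent, -1);  /* uncharge is accounted to the parent */
	return parent->stat_local;
}
```

Starting from zeroed counters, the parent's hierarchical counter ends up
balanced while its local counter goes negative, because the matching
increment is still sitting in the child's local counter.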

> at the time of
> reparenting, when the memory gets uncharged the stats updates will occur
> at the parent. This will update both hierarchical and non-hierarchical
> stats of the parent, which would corrupt the parent's non-hierarchical
> stats (because those counters were never incremented when the memory was
> charged).
> 
> I didn't track down which stats are affected by this, but off the top of
> my head I think all stats tracking anon, file, etc.

Let's start with which specific stats might be affected. First, the
stats which are monotonically increasing should be fine, like
WORKINGSET_REFAULT_[ANON|FILE], PGPG[IN|OUT], PG[MAJ]FAULT.

So, the following ones are the interesting ones:

NR_FILE_PAGES, NR_ANON_MAPPED, NR_ANON_THPS, NR_SHMEM, NR_FILE_MAPPED,
NR_FILE_DIRTY, NR_WRITEBACK, MEMCG_SWAP, NR_SWAPCACHE.

> 
> The obvious solution is to flush and reparent the stats of a dying memcg
> during reparenting,

Again, I am not sure how flushing will help here. And what do you mean
by 'reparent the stats'? Do you mean something like:

parent->vmstats->state_local += child->vmstats->state_local;

Hmm this seems fine and I think it should work.
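
Spelled out per stat item, the one-liner above might look like the sketch
below. It is a standalone toy and not the kernel's data structures: struct
toy_vmstats, TOY_NR_STATS and toy_reparent_state_local() are hypothetical
names. Zeroing the child's counters after the fold is an added assumption,
so that a second fold cannot count the same amounts twice.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_STATS 3  /* hypothetical number of tracked stat items */

/* Toy stand-in for the per-memcg non-hierarchical counters. */
struct toy_vmstats {
	long state_local[TOY_NR_STATS];
};

/*
 * Fold the dying child's non-hierarchical counters into the parent.
 * The child's counters are cleared afterwards (an assumption of this
 * sketch, not something stated in the thread).
 */
static void toy_reparent_state_local(struct toy_vmstats *parent,
				     struct toy_vmstats *child)
{
	assert(parent != NULL && child != NULL);
	for (size_t i = 0; i < TOY_NR_STATS; i++) {
		parent->state_local[i] += child->state_local[i];
		child->state_local[i] = 0;
	}
}
```

With this fold, a parent counter that went negative because of uncharges
after reparenting is brought back in line by the increments that were
recorded in the child.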

> but I don't think this entirely fixes the problem
> because the dying memcg stats can still be updated after its reparenting
> (e.g. if a ref to the memcg has been held since before reparenting).

How can dying memcg stats still be updated after reparenting? The
stats which we care about are the anon & file memory, and this series is
reparenting them, so the dying memcg will not see stats updates unless
there is a concurrent update happening. I think it is very easy to avoid
such a situation by putting a grace period between reparenting the
file/anon folios and reparenting the dying child's stats_local. Am I
missing something?
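
The ordering proposed above can be sketched as a single-threaded toy. The
email does not say which mechanism would provide the grace period, so
toy_grace_period() below simply completes every update that was already in
flight; all names (toy_inflight, toy_complete(), toy_fold_local(),
toy_reparent_stats()) are hypothetical and nothing here is kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy memcg with just a non-hierarchical counter. */
struct toy_memcg {
	long stat_local;
};

/* An update that has picked its target memcg but not applied it yet. */
struct toy_inflight {
	struct toy_memcg *target;
	long delta;
	int pending;
};

/* Apply an in-flight update exactly once. */
static void toy_complete(struct toy_inflight *u)
{
	if (u->pending) {
		u->target->stat_local += u->delta;
		u->pending = 0;
	}
}

/*
 * Stand-in for a grace period: returns only after every updater that
 * started earlier has finished.
 */
static void toy_grace_period(struct toy_inflight *updates, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_complete(&updates[i]);
}

/* Move the child's local counter into the parent. */
static void toy_fold_local(struct toy_memcg *parent, struct toy_memcg *child)
{
	parent->stat_local += child->stat_local;
	child->stat_local = 0;
}

/*
 * Ordering from the paragraph above: the folios are assumed to be
 * reparented already, so no new update can target the child. Wait for
 * the stragglers, then fold.
 */
static void toy_reparent_stats(struct toy_memcg *parent,
			       struct toy_memcg *child,
			       struct toy_inflight *updates, size_t n)
{
	assert(parent != NULL && child != NULL);
	toy_grace_period(updates, n);
	toy_fold_local(parent, child);
}
```

If the fold ran before the wait, a late update would land in the child after
its counter had been moved and the delta would be lost to the parent; doing
the wait first is what closes that window in this model.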

