Message-ID: <prqhodx7wc3cbrlh7tqf632b3gpcciwmn5n22qqv7c7rbtsoy3@lsnd7rtdhfmh>
Date: Mon, 5 Jan 2026 11:41:46 +0100
From: Michal Koutný <mkoutny@...e.com>
To: Qi Zheng <qi.zheng@...ux.dev>
Cc: hannes@...xchg.org, hughd@...gle.com, mhocko@...e.com, 
	roman.gushchin@...ux.dev, shakeel.butt@...ux.dev, muchun.song@...ux.dev, david@...nel.org, 
	lorenzo.stoakes@...cle.com, ziy@...dia.com, harry.yoo@...cle.com, imran.f.khan@...cle.com, 
	kamalesh.babulal@...cle.com, axelrasmussen@...gle.com, yuanchu@...gle.com, weixugc@...gle.com, 
	chenridong@...weicloud.com, akpm@...ux-foundation.org, hamzamahfooz@...ux.microsoft.com, 
	apais@...ux.microsoft.com, lance.yang@...ux.dev, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org, cgroups@...r.kernel.org, Muchun Song <songmuchun@...edance.com>, 
	Qi Zheng <zhengqi.arch@...edance.com>, Yosry Ahmed <yosry.ahmed@...ux.dev>
Subject: Re: [PATCH v2 27/28] mm: memcontrol: eliminate the problem of dying
 memory cgroup for LRU folios

Hi Qi.

On Wed, Dec 17, 2025 at 03:27:51PM +0800, Qi Zheng <qi.zheng@...ux.dev> wrote:

> @@ -5200,22 +5238,27 @@ int __mem_cgroup_try_charge_swap(struct folio *folio, swp_entry_t entry)
>  	unsigned int nr_pages = folio_nr_pages(folio);
>  	struct page_counter *counter;
>  	struct mem_cgroup *memcg;
> +	struct obj_cgroup *objcg;
>  
>  	if (do_memsw_account())
>  		return 0;
>  
> -	memcg = folio_memcg(folio);
> -
> -	VM_WARN_ON_ONCE_FOLIO(!memcg, folio);
> -	if (!memcg)
> +	objcg = folio_objcg(folio);
> +	VM_WARN_ON_ONCE_FOLIO(!objcg, folio);
> +	if (!objcg)
>  		return 0;
>  
> +	rcu_read_lock();
> +	memcg = obj_cgroup_memcg(objcg);
>  	if (!entry.val) {
>  		memcg_memory_event(memcg, MEMCG_SWAP_FAIL);
> +		rcu_read_unlock();
>  		return 0;
>  	}
>  
>  	memcg = mem_cgroup_id_get_online(memcg);
> +	/* memcg is pined by memcg ID. */
> +	rcu_read_unlock();
>  
>  	if (!mem_cgroup_is_root(memcg) &&
>  	    !page_counter_try_charge(&memcg->swap, nr_pages, &counter)) {

Later there is:
	swap_cgroup_record(folio, mem_cgroup_id(memcg), entry);

As the comment says, memcg stays pinned by the ID, and that ID is
associated with a swap slot, so the pin can theoretically last for an
unbounded time (e.g. shmem).
(Yosry actually brought this up in the stats subthread [1].)

I think that should be tackled too to eliminate the problem completely.

Looking at the code, these memcg IDs (private [2]) could be converted
to objcg IDs so that reparenting also applies to folios that are
currently swapped out. (Or convert swap_cgroup_ctrl from a vector
of IDs to a vector of objcg pointers, depending on space.)
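
To illustrate the second variant, here is a rough userspace toy model, not
kernel code. All toy_* names are made up for this sketch, and the refcounting
is heavily simplified. It only shows why recording an objcg in the swap slot
lets reparenting release the dying memcg, whereas a recorded memcg ID would
keep it pinned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup. */
struct toy_memcg {
	struct toy_memcg *parent;
	int refs;			/* pins held on this memcg */
};

/* Hypothetical stand-in for struct obj_cgroup: one level of indirection. */
struct toy_objcg {
	struct toy_memcg *memcg;	/* retargeted on reparenting */
};

/* Hypothetical swap slot record holding an objcg pointer, not a memcg ID. */
struct toy_swap_slot {
	struct toy_objcg *objcg;
};

/* Bind an objcg to a memcg; the objcg itself pins the memcg. */
static void toy_objcg_init(struct toy_objcg *objcg, struct toy_memcg *memcg)
{
	objcg->memcg = memcg;
	memcg->refs++;
}

/* Record the owner of a swapped-out folio through the objcg. */
static void toy_swap_record(struct toy_swap_slot *slot, struct toy_objcg *objcg)
{
	slot->objcg = objcg;
}

/* Resolve the memcg currently responsible for a swap slot. */
static struct toy_memcg *toy_swap_slot_memcg(const struct toy_swap_slot *slot)
{
	return slot->objcg->memcg;
}

/* On offlining, move the objcg (and its pin) to the parent memcg. */
static void toy_reparent(struct toy_objcg *objcg)
{
	struct toy_memcg *dying = objcg->memcg;
	struct toy_memcg *parent = dying->parent;

	assert(parent != NULL);	/* the root is never reparented */
	parent->refs++;
	dying->refs--;
	objcg->memcg = parent;
}
```

In this model the swap slot never touches the memcg directly, so after
toy_reparent() a lookup through the slot resolves to the parent and the dying
memcg has no remaining pins from swapped-out folios.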

Thanks,
Michal

[1] https://lore.kernel.org/r/ebdhvcwygvnfejai5azhg3sjudsjorwmlcvmzadpkhexoeq3tb@5gj5y2exdhpn
[2] https://lore.kernel.org/r/20251225232116.294540-1-shakeel.butt@linux.dev
