Message-ID: <aYe1R2MMcXbPVYUW@linux.dev>
Date: Sat, 7 Feb 2026 14:25:44 -0800
From: Shakeel Butt <shakeel.butt@...ux.dev>
To: Qi Zheng <qi.zheng@...ux.dev>
Cc: hannes@...xchg.org, hughd@...gle.com, mhocko@...e.com, 
	roman.gushchin@...ux.dev, muchun.song@...ux.dev, david@...nel.org, 
	lorenzo.stoakes@...cle.com, ziy@...dia.com, harry.yoo@...cle.com, yosry.ahmed@...ux.dev, 
	imran.f.khan@...cle.com, kamalesh.babulal@...cle.com, axelrasmussen@...gle.com, 
	yuanchu@...gle.com, weixugc@...gle.com, chenridong@...weicloud.com, mkoutny@...e.com, 
	akpm@...ux-foundation.org, hamzamahfooz@...ux.microsoft.com, apais@...ux.microsoft.com, 
	lance.yang@...ux.dev, bhe@...hat.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org, 
	cgroups@...r.kernel.org, Muchun Song <songmuchun@...edance.com>, 
	Qi Zheng <zhengqi.arch@...edance.com>
Subject: Re: [PATCH v4 30/31] mm: memcontrol: eliminate the problem of dying
 memory cgroup for LRU folios

On Thu, Feb 05, 2026 at 05:01:49PM +0800, Qi Zheng wrote:
> From: Muchun Song <songmuchun@...edance.com>
> 
> Now that everything is set up, switch folio->memcg_data pointers to
> objcgs, update the accessors, and execute reparenting on cgroup death.
> 
> Finally, folio->memcg_data of LRU folios and kmem folios will always
> point to an object cgroup pointer. The folio->memcg_data of slab
> folios will point to a vector of object cgroups.
> 
> Signed-off-by: Muchun Song <songmuchun@...edance.com>
> Signed-off-by: Qi Zheng <zhengqi.arch@...edance.com>
>  
>  /*
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e7d4e4ff411b6..0e0efaa511d3d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -247,11 +247,25 @@ static inline void reparent_state_local(struct mem_cgroup *memcg, struct mem_cgr
>  
>  static inline void reparent_locks(struct mem_cgroup *memcg, struct mem_cgroup *parent)
>  {
> +	int nid, nest = 0;
> +
>  	spin_lock_irq(&objcg_lock);
> +	for_each_node(nid) {
> +		spin_lock_nested(&mem_cgroup_lruvec(memcg,
> +				 NODE_DATA(nid))->lru_lock, nest++);
> +		spin_lock_nested(&mem_cgroup_lruvec(parent,
> +				 NODE_DATA(nid))->lru_lock, nest++);

Is there a reason to acquire the locks for all nodes together? Why not do
the for_each_node(nid) loop in memcg_reparent_objcgs() and reparent the
LRUs for each node one by one, taking and releasing the locks
individually? The lock for the offlining memcg might not be contended,
but the parent's lock might be if a lot of memory is being reparented.

> +	}
>  }
>  
>  static inline void reparent_unlocks(struct mem_cgroup *memcg, struct mem_cgroup *parent)
>  {
> +	int nid;
> +
> +	for_each_node(nid) {
> +		spin_unlock(&mem_cgroup_lruvec(parent, NODE_DATA(nid))->lru_lock);
> +		spin_unlock(&mem_cgroup_lruvec(memcg, NODE_DATA(nid))->lru_lock);
> +	}
>  	spin_unlock_irq(&objcg_lock);
>  }
>  
> @@ -260,12 +274,28 @@ static void memcg_reparent_objcgs(struct mem_cgroup *memcg)
>  	struct obj_cgroup *objcg;
>  	struct mem_cgroup *parent = parent_mem_cgroup(memcg);
>  
> +retry:
> +	if (lru_gen_enabled())
> +		max_lru_gen_memcg(parent);
> +
>  	reparent_locks(memcg, parent);
> +	if (lru_gen_enabled()) {
> +		if (!recheck_lru_gen_max_memcg(parent)) {
> +			reparent_unlocks(memcg, parent);
> +			cond_resched();
> +			goto retry;
> +		}
> +		lru_gen_reparent_memcg(memcg, parent);
> +	} else {
> +		lru_reparent_memcg(memcg, parent);
> +	}
>  
>  	objcg = __memcg_reparent_objcgs(memcg, parent);

The above does not need the lru locks. With the per-node refactoring, it
can be moved out of the lru lock critical section.
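I.e. something like (sketch only; memcg_reparent_lrus() is a hypothetical
helper that does its own per-node lru_lock acquire/release):

	/* reparent LRUs with per-node lock/unlock, no objcg_lock held */
	memcg_reparent_lrus(memcg, parent);

	/* the objcg list splice only needs objcg_lock */
	spin_lock_irq(&objcg_lock);
	objcg = __memcg_reparent_objcgs(memcg, parent);
	spin_unlock_irq(&objcg_lock);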

>  
>  	reparent_unlocks(memcg, parent);
>  
> +	reparent_state_local(memcg, parent);
> +
>  	percpu_ref_kill(&objcg->refcnt);
>  }
>  
>  

[...]

>  static int charge_memcg(struct folio *folio, struct mem_cgroup *memcg,
>  			gfp_t gfp)
>  {
> -	int ret;
> -
> -	ret = try_charge(memcg, gfp, folio_nr_pages(folio));
> -	if (ret)
> -		goto out;
> +	int ret = 0;
> +	struct obj_cgroup *objcg;
>  
> -	css_get(&memcg->css);
> -	commit_charge(folio, memcg);
> +	objcg = get_obj_cgroup_from_memcg(memcg);
> +	/* Do not account at the root objcg level. */
> +	if (!obj_cgroup_is_root(objcg))
> +		ret = try_charge(memcg, gfp, folio_nr_pages(folio));

Use try_charge_memcg() directly; that removes the last user of
try_charge(), so try_charge() can be removed completely.
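I.e. something like this (untested sketch against this patch, only the
try_charge() call changed):

	static int charge_memcg(struct folio *folio, struct mem_cgroup *memcg,
				gfp_t gfp)
	{
		int ret = 0;
		struct obj_cgroup *objcg;

		objcg = get_obj_cgroup_from_memcg(memcg);
		/* Do not account at the root objcg level. */
		if (!obj_cgroup_is_root(objcg))
			ret = try_charge_memcg(memcg, gfp, folio_nr_pages(folio));
		if (ret) {
			obj_cgroup_put(objcg);
			return ret;
		}
		commit_charge(folio, objcg);
		memcg1_commit_charge(folio, memcg);

		return ret;
	}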

> +	if (ret) {
> +		obj_cgroup_put(objcg);
> +		return ret;
> +	}
> +	commit_charge(folio, objcg);
>  	memcg1_commit_charge(folio, memcg);
> -out:
> +
>  	return ret;
>  }
>  
