linux-kernel - Re: [PATCH rfc 6/9] mm: memcg: move cgroup v1 oom handling code into memcontrol-v1.c


Message-ID: <Zj4gi-vOxLZi2van@tiehlicka>
Date: Fri, 10 May 2024 15:26:35 +0200
From: Michal Hocko <mhocko@...e.com>
To: Roman Gushchin <roman.gushchin@...ux.dev>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Muchun Song <muchun.song@...ux.dev>,
	Johannes Weiner <hannes@...xchg.org>,
	Shakeel Butt <shakeel.butt@...ux.dev>,
	Matthew Wilcox <willy@...radead.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH rfc 6/9] mm: memcg: move cgroup v1 oom handling code into
 memcontrol-v1.c

On Wed 08-05-24 20:41:35, Roman Gushchin wrote:
[...]
> @@ -1747,106 +1623,14 @@ static bool mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
>  
>  	memcg_memory_event(memcg, MEMCG_OOM);
>  
> -	/*
> -	 * We are in the middle of the charge context here, so we
> -	 * don't want to block when potentially sitting on a callstack
> -	 * that holds all kinds of filesystem and mm locks.
> -	 *
> -	 * cgroup1 allows disabling the OOM killer and waiting for outside
> -	 * handling until the charge can succeed; remember the context and put
> -	 * the task to sleep at the end of the page fault when all locks are
> -	 * released.
> -	 *
> -	 * On the other hand, in-kernel OOM killer allows for an async victim
> -	 * memory reclaim (oom_reaper) and that means that we are not solely
> -	 * relying on the oom victim to make a forward progress and we can
> -	 * invoke the oom killer here.
> -	 *
> -	 * Please note that mem_cgroup_out_of_memory might fail to find a
> -	 * victim and then we have to bail out from the charge path.
> -	 */
> -	if (READ_ONCE(memcg->oom_kill_disable)) {
> -		if (current->in_user_fault) {
> -			css_get(&memcg->css);
> -			current->memcg_in_oom = memcg;
> -			current->memcg_oom_gfp_mask = mask;
> -			current->memcg_oom_order = order;
> -		}
> +	if (!mem_cgroup_v1_oom_prepare(memcg, mask, order, &locked))
>  		return false;
> -	}
> -
> -	mem_cgroup_mark_under_oom(memcg);
> -
> -	locked = mem_cgroup_oom_trylock(memcg);

This really confused me at first because it looks as if the oom locking
is removed for v2. That is not the case, though, because
mem_cgroup_v1_oom_prepare is not really v1-only code. In other words,
it is not going to reduce to a plain "return false" for CONFIG_MEMCG_V1=n.

It makes sense to move the userspace oom handling out to the v1 file. I
would keep mem_cgroup_mark_under_oom here. I am not sure about the oom
locking because I think we can make it v1 only. For v2 I guess we can go
without this locking, as the oom path is already locked and it
implements overkilling prevention (oom_evaluate_task) as it walks all
processes in the oom hierarchy.
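
To make the suggested split concrete, here is a rough userspace sketch of
one way it could look. It is not kernel code and not a proposed patch:
struct mock_memcg, the MOCK_MEMCG_V1 switch and every mock_* function are
made up for illustration. The assumption is that the oom_kill_disable
check and the oom lock become v1-only hooks that compile down to trivial
stubs otherwise, while marking the group under oom stays in the common
path. In this sketch the v2 stubs let the caller proceed to the killer.

```c
#include <stdbool.h>

/* Hypothetical build switch standing in for CONFIG_MEMCG_V1. */
#ifndef MOCK_MEMCG_V1
#define MOCK_MEMCG_V1 1
#endif

/* Hypothetical stand-in for struct mem_cgroup; not the kernel type. */
struct mock_memcg {
	bool oom_kill_disable;	/* v1-only knob */
	int under_oom;		/* mark/unmark counter */
	int oom_lock;		/* v1-only oom lock, 0 means free */
	int kills;		/* how often the mock killer ran */
};

#if MOCK_MEMCG_V1
/* v1: the killer may be disabled and the oom handed to userspace. */
static bool mock_v1_oom_deferred(struct mock_memcg *memcg)
{
	return memcg->oom_kill_disable;
}

/* v1: take the oom lock if it is free. */
static bool mock_v1_oom_trylock(struct mock_memcg *memcg)
{
	if (memcg->oom_lock)
		return false;
	memcg->oom_lock = 1;
	return true;
}

static void mock_v1_oom_unlock(struct mock_memcg *memcg, bool locked)
{
	if (locked)
		memcg->oom_lock = 0;
}
#else
/* v2 only: nothing is deferred and there is no extra locking. */
static bool mock_v1_oom_deferred(struct mock_memcg *memcg)
{
	(void)memcg;
	return false;
}

static bool mock_v1_oom_trylock(struct mock_memcg *memcg)
{
	(void)memcg;
	return false;
}

static void mock_v1_oom_unlock(struct mock_memcg *memcg, bool locked)
{
	(void)memcg;
	(void)locked;
}
#endif

/* Common path: marking the group under oom stays here. */
static bool mock_mem_cgroup_oom(struct mock_memcg *memcg)
{
	bool locked;

	if (mock_v1_oom_deferred(memcg))
		return false;	/* bail out of the charge path */

	memcg->under_oom++;	/* stands in for mem_cgroup_mark_under_oom() */
	locked = mock_v1_oom_trylock(memcg);
	memcg->under_oom--;	/* stands in for mem_cgroup_unmark_under_oom() */

	memcg->kills++;		/* stands in for mem_cgroup_out_of_memory() */

	mock_v1_oom_unlock(memcg, locked);
	return true;
}
```

With the default MOCK_MEMCG_V1=1, a group with oom_kill_disable set makes
mock_mem_cgroup_oom() return false without running the mock killer. Every
other call runs it once and leaves the lock free again.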

> -
> -	if (locked)
> -		mem_cgroup_oom_notify(memcg);
> -
> -	mem_cgroup_unmark_under_oom(memcg);
>  	ret = mem_cgroup_out_of_memory(memcg, mask, order);
> -
> -	if (locked)
> -		mem_cgroup_oom_unlock(memcg);
> +	mem_cgroup_v1_oom_finish(memcg, &locked);
>  
>  	return ret;
>  }

-- 
Michal Hocko
SUSE Labs