Message-ID: <20250114192322.GB1115056@cmpxchg.org>
Date: Tue, 14 Jan 2025 14:23:22 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Michal Hocko <mhocko@...e.com>
Cc: Rik van Riel <riel@...riel.com>, Yosry Ahmed <yosryahmed@...gle.com>,
	Balbir Singh <balbirs@...dia.com>,
	Roman Gushchin <roman.gushchin@...ux.dev>,
	Shakeel Butt <shakeel.butt@...ux.dev>,
	Muchun Song <muchun.song@...ux.dev>,
	Andrew Morton <akpm@...ux-foundation.org>, cgroups@...r.kernel.org,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	kernel-team@...a.com, Nhat Pham <nphamcs@...il.com>
Subject: Re: [PATCH v2] memcg: allow exiting tasks to write back data to swap

On Tue, Jan 14, 2025 at 07:13:07PM +0100, Michal Hocko wrote:
> Anyway, have you tried to reproduce with 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7b3503d12aaf..9c30c442e3b0 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1627,7 +1627,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	 * A few threads which were not waiting at mutex_lock_killable() can
>  	 * fail to bail out. Therefore, check again after holding oom_lock.
>  	 */
> -	ret = task_is_dying() || out_of_memory(&oc);
> +	ret = out_of_memory(&oc);
>  
>  unlock:
>  	mutex_unlock(&oom_lock);
> 
> proposed by Johannes earlier? This should help to trigger the oom reaper
> to free up some memory.

Yes, I was wondering about that too.

If the OOM reaper can be our reliable way of forward progress, we
don't need any reserve or headroom beyond memory.max.

IIRC it can fail if somebody is holding mmap_sem for writing. The exit
path does take it at some point, but only around the time it frees up
all its memory voluntarily anyway, so that should be fine. Are you
aware of other scenarios where it can fail?

What if everything has been swapped out already and there is nothing
to reap? IOW, only unreclaimable/kernel memory remaining in the group.

It still seems to me that allowing the OOM victim (and only the OOM
victim) to bypass memory.max is the only way to guarantee forward
progress.
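
As a rough illustration of that bypass, here is a userspace toy model.
It is not kernel code: the toy_* names and structures are made up for
this sketch, and it only shows the charge decision, assuming the OOM
victim is the one task allowed to exceed the limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the kernel's real structures. */
struct toy_memcg {
	unsigned long usage;	/* pages currently charged */
	unsigned long max;	/* memory.max, in pages */
};

struct toy_task {
	bool oom_victim;	/* set once the OOM killer has picked this task */
};

/*
 * Try to charge nr_pages to the group. Returns true if the charge was
 * accepted. Only an OOM victim may push usage above max; everyone else
 * is refused at the limit.
 */
static bool toy_try_charge(struct toy_memcg *cg, const struct toy_task *t,
			   unsigned long nr_pages)
{
	assert(nr_pages > 0);

	if (cg->usage + nr_pages <= cg->max || t->oom_victim) {
		cg->usage += nr_pages;
		return true;
	}

	/* A real caller would have to reclaim or invoke the OOM killer. */
	return false;
}
```

With the group already at its limit, a regular task's charge is
refused, while the victim's charge goes through so it can finish
exiting and release its memory.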

I'm not really concerned about side effects. Any runaway allocation in
the exit path (like the vmalloc one you referenced before) is a much
bigger concern for exceeding the physical OOM reserves in the page
allocator. What's a containment failure for cgroups would be a memory
deadlock at the system level. It's a class of kernel bug that needs
fixing, not something we can really work around in the cgroup code.
