linux-kernel - Re: [PATCH v2] memcg: allow exiting tasks to write back data to swap

Message-ID: <Z4apM9lbuptQBA5Z@tiehlicka>
Date: Tue, 14 Jan 2025 19:13:07 +0100
From: Michal Hocko <mhocko@...e.com>
To: Rik van Riel <riel@...riel.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
	Yosry Ahmed <yosryahmed@...gle.com>,
	Balbir Singh <balbirs@...dia.com>,
	Roman Gushchin <roman.gushchin@...ux.dev>,
	Shakeel Butt <shakeel.butt@...ux.dev>,
	Muchun Song <muchun.song@...ux.dev>,
	Andrew Morton <akpm@...ux-foundation.org>, cgroups@...r.kernel.org,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	kernel-team@...a.com, Nhat Pham <nphamcs@...il.com>
Subject: Re: [PATCH v2] memcg: allow exiting tasks to write back data to swap

On Tue 14-01-25 12:11:54, Rik van Riel wrote:
> On Tue, 2025-01-14 at 18:00 +0100, Michal Hocko wrote:
> > On Tue 14-01-25 11:51:18, Rik van Riel wrote:
> > > On Tue, 2025-01-14 at 17:46 +0100, Michal Hocko wrote:
> > > > On Tue 14-01-25 11:09:55, Johannes Weiner wrote:
> > > > 
> > > > > charge_memcg
> > > > > mem_cgroup_swapin_charge_folio
> > > > > __read_swap_cache_async
> > > > > swapin_readahead
> > > > > do_swap_page
> > > > > handle_mm_fault
> > > > > do_user_addr_fault
> > > > > exc_page_fault
> > > > > asm_exc_page_fault
> > > > > __get_user
> > > > 
> > > > All the way here and return the failure to futex_cleanup which doesn't
> > > > retry __get_user on the failure AFAICS (exit_robust_list). But I might
> > > > be missing something, it's been quite some time since I've looked into
> > > > futex code.
> > > 
> > > Can you explain how -ENOMEM would get propagated down
> > > past the page fault handler?
> > > 
> > > This isn't get_user_pages(), which can just pass
> > > -ENOMEM on to the caller.
> > > 
> > > If there is code to pass -ENOMEM on past the page
> > > fault exception handler, I have not been able to
> > > find it. How does this work?
> > 
> > > This might be me misunderstanding the get_user machinery, but doesn't it
> > > return a failure when the PF handler returns ENOMEM?
> 
> I believe __get_user simply does a memcpy, and ends
> up in the page fault handler.

It's been ages since I've looked into that code and my memory might be
very rusty. But IIRC the page fault would be handled through the exception
table and return EFAULT on failure. I am not really sure whether that is
the case for all errors returned by the page fault handler or only for
SIGSEGV/SIGBUS. I need to refresh my memory on that.

Anyway, have you tried to reproduce with 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b3503d12aaf..9c30c442e3b0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1627,7 +1627,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	 * A few threads which were not waiting at mutex_lock_killable() can
 	 * fail to bail out. Therefore, check again after holding oom_lock.
 	 */
-	ret = task_is_dying() || out_of_memory(&oc);
+	ret = out_of_memory(&oc);
 
 unlock:
 	mutex_unlock(&oom_lock);

proposed by Johannes earlier? This should help to trigger the oom reaper
to free up some memory.
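
To make the effect of that one-line change concrete, here is a minimal
user-space sketch of the short-circuit it removes. It is not kernel code:
task_is_dying() and out_of_memory() below are hypothetical stand-ins that
only share their names with the real helpers, and the counter and flag are
invented for illustration. The model assumes nothing about what the real
out_of_memory() does for a dying task; it only shows whether it gets called.

```c
#include <stdbool.h>

/* Hypothetical model state, not part of the kernel. */
int oom_invocations;      /* how many times the model out_of_memory() ran */
bool current_is_dying;    /* what the model task_is_dying() reports */

/* Stand-in for the kernel's task_is_dying(); assumption for illustration. */
bool task_is_dying(void)
{
	return current_is_dying;
}

/* Stand-in for out_of_memory(&oc); it only records that it was reached. */
bool out_of_memory(void)
{
	oom_invocations++;
	return true;
}

/* Line removed by the diff: a dying task short-circuits the OOM path. */
bool oom_check_before(void)
{
	return task_is_dying() || out_of_memory();
}

/* Line added by the diff: the OOM path is always entered. */
bool oom_check_after(void)
{
	return out_of_memory();
}
```

With current_is_dying set, oom_check_before() returns true without ever
reaching out_of_memory(), while oom_check_after() always calls it. Both
variants report success in this model; the difference is only whether the
OOM path is entered for an exiting task.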
-- 
Michal Hocko
SUSE Labs