Message-ID: <Y5idFucjKVbjatqc@dhcp22.suse.cz>
Date: Tue, 13 Dec 2022 16:41:10 +0100
From: Michal Hocko <mhocko@...e.com>
To: Dave Hansen <dave.hansen@...el.com>,
"Huang, Ying" <ying.huang@...el.com>
Cc: Yang Shi <shy828301@...il.com>, Wei Xu <weixugc@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
LKML <linux-kernel@...r.kernel.org>
Subject: memcg reclaim demotion wrt. isolation
Hi,

I have just noticed that the allocations for demotion targets include
__GFP_KSWAPD_RECLAIM (through GFP_NOWAIT). This has been the case since
the code was introduced by 26aa2d199d6f ("mm/migrate: demote pages
during reclaim"). I suspect the intention is to trigger aging on the
fallback node and either drop or further demote its oldest pages.
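
For reference, the allocation context for the demotion target looks
roughly like this (paraphrased from demote_folio_list() in mm/vmscan.c,
so not a verbatim quote):

	struct migration_target_control mtc = {
		/*
		 * Fail quickly and quietly rather than stall; if the
		 * allocation fails, the folio will likely just be
		 * discarded instead of migrated.
		 */
		.gfp_mask = (GFP_HIGHUSER_MOVABLE & ~__GFP_RECLAIM) |
			    __GFP_NOWARN | __GFP_NOMEMALLOC | GFP_NOWAIT,
		.nid = target_nid,
		.nmask = &allowed_mask
	};

Because GFP_NOWAIT is defined as __GFP_KSWAPD_RECLAIM (see
include/linux/gfp_types.h), the mask clears both reclaim bits with
~__GFP_RECLAIM only to add the kswapd wake-up bit right back.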

This makes sense, but I suspect it wasn't intended to apply to memcg
triggered reclaim as well. It would mean that memory pressure in one
hierarchy could trigger paging out pages of a different hierarchy if
the demotion target is close to full.
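
To spell out the mechanism: when the GFP_NOWAIT allocation on the
demotion target misses the watermarks, the slowpath wakes kswapd on
that node because __GFP_KSWAPD_RECLAIM is set, and kswapd then ages the
node globally, with no regard to the memcg that triggered the demotion.
Condensed from mm/page_alloc.c:

	/* gfp_to_alloc_flags() */
	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
		alloc_flags |= ALLOC_KSWAPD;

	/* __alloc_pages_slowpath() */
	if (alloc_flags & ALLOC_KSWAPD)
		wake_all_kswapds(order, gfp_mask, ac);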

I haven't really looked into the current kswapd wake-up checks, but I
suspect kswapd would back off in most cases, so this shouldn't cause
any big problems. Still, I guess it would be better to simply not wake
kswapd up for memcg reclaim. What do you think?
---
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8fcc5fa768c0..1f3161173b85 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1568,7 +1568,7 @@ static struct page *alloc_demote_page(struct page *page, unsigned long private)
  * Folios which are not demoted are left on @demote_folios.
  */
 static unsigned int demote_folio_list(struct list_head *demote_folios,
-				      struct pglist_data *pgdat)
+				      struct pglist_data *pgdat, bool cgroup_reclaim)
 {
 	int target_nid = next_demotion_node(pgdat->node_id);
 	unsigned int nr_succeeded;
@@ -1589,6 +1589,10 @@ static unsigned int demote_folio_list(struct list_head *demote_folios,
 	if (list_empty(demote_folios))
 		return 0;
 
+	/* local memcg reclaim shouldn't directly reclaim from other memcgs */
+	if (cgroup_reclaim)
+		mtc.gfp_mask &= ~__GFP_RECLAIM;
+
 	if (target_nid == NUMA_NO_NODE)
 		return 0;
 
@@ -2066,7 +2070,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 	/* 'folio_list' is always empty here */
 
 	/* Migrate folios selected for demotion */
-	nr_reclaimed += demote_folio_list(&demote_folios, pgdat);
+	nr_reclaimed += demote_folio_list(&demote_folios, pgdat, cgroup_reclaim(sc));
 	/* Folios that could not be demoted are still in @demote_folios */
 	if (!list_empty(&demote_folios)) {
 		/* Folios which weren't demoted go back on @folio_list for retry: */
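
For completeness, cgroup_reclaim(sc) used above is the existing helper
in mm/vmscan.c that distinguishes memcg-targeted reclaim from global
reclaim:

	static bool cgroup_reclaim(struct scan_control *sc)
	{
		return sc->target_mem_cgroup;
	}
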
--
Michal Hocko
SUSE Labs