Message-ID: <20240418124043.GC1055428@cmpxchg.org>
Date: Thu, 18 Apr 2024 08:40:43 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Christian Heusel <christian@...sel.eu>
Cc: Chengming Zhou <chengming.zhou@...ux.dev>,
Nhat Pham <nphamcs@...il.com>, Seth Jennings <sjenning@...hat.com>,
Dan Streetman <ddstreet@...e.org>,
Vitaly Wool <vitaly.wool@...sulko.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, David Runge <dave@...epmap.de>,
"Richard W.M. Jones" <rjones@...hat.com>,
Mark W <instruform@...il.com>, regressions@...ts.linux.dev,
Yosry Ahmed <yosryahmed@...gle.com>
Subject: Re: [REGRESSION] Null pointer dereference while shrinking zswap
On Wed, Apr 17, 2024 at 07:18:14PM +0200, Christian Heusel wrote:
> On 24/04/17 10:33AM, Johannes Weiner wrote:
> >
> > Christian, can you please test the below patch on top of current
> > upstream?
> >
>
> Hey Johannes,
>
> I have applied your patch on top of 6.9-rc4 and it did solve the crash for
> me, thanks for hacking together a fix so quickly! 🤗
>
> Tested-By: Christian Heusel <christian@...sel.eu>
Thanks for confirming it, and sorry about the breakage.
Andrew, can you please use the updated changelog below?
---
From 52f67f5fab6a743c2aedfc8e04a582a9d1025c28 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@...xchg.org>
Date: Thu, 18 Apr 2024 08:26:28 -0400
Subject: [PATCH] mm: zswap: fix shrinker NULL crash with cgroup_disable=memory
Christian reports a NULL deref in zswap that he bisected down to the
zswap shrinker. The issue also cropped up in the bug trackers of
libguestfs [1] and the Red Hat bugzilla [2].
The problem is that when memcg is disabled with the boot time flag,
the zswap shrinker might get called with sc->memcg == NULL. This is
okay in many places, like the lruvec operations. But it crashes in
memcg_page_state() - which is only used due to the non-node accounting
of the cgroup's zswap memory to begin with.
Nhat spotted that the memcg can be NULL in the memcg-disabled case,
and I was then able to reproduce the crash locally as well.
[1] https://github.com/libguestfs/libguestfs/issues/139
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2275252
Fixes: b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
Cc: stable@...r.kernel.org [v6.8]
Link: https://lkml.kernel.org/r/20240417143324.GA1055428@cmpxchg.org
Reported-by: Christian Heusel <christian@...sel.eu>
Debugged-by: Nhat Pham <nphamcs@...il.com>
Suggested-by: Nhat Pham <nphamcs@...il.com>
Tested-by: Christian Heusel <christian@...sel.eu>
Signed-off-by: Johannes Weiner <hannes@...xchg.org>
---
mm/zswap.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/mm/zswap.c b/mm/zswap.c
index caed028945b0..6f8850c44b61 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1331,15 +1331,22 @@ static unsigned long zswap_shrinker_count(struct shrinker *shrinker,
 	if (!gfp_has_io_fs(sc->gfp_mask))
 		return 0;
 
-#ifdef CONFIG_MEMCG_KMEM
-	mem_cgroup_flush_stats(memcg);
-	nr_backing = memcg_page_state(memcg, MEMCG_ZSWAP_B) >> PAGE_SHIFT;
-	nr_stored = memcg_page_state(memcg, MEMCG_ZSWAPPED);
-#else
-	/* use pool stats instead of memcg stats */
-	nr_backing = zswap_pool_total_size >> PAGE_SHIFT;
-	nr_stored = atomic_read(&zswap_nr_stored);
-#endif
+	/*
+	 * For memcg, use the cgroup-wide ZSWAP stats since we don't
+	 * have them per-node and thus per-lruvec. Careful if memcg is
+	 * runtime-disabled: we can get sc->memcg == NULL, which is ok
+	 * for the lruvec, but not for memcg_page_state().
+	 *
+	 * Without memcg, use the zswap pool-wide metrics.
+	 */
+	if (!mem_cgroup_disabled()) {
+		mem_cgroup_flush_stats(memcg);
+		nr_backing = memcg_page_state(memcg, MEMCG_ZSWAP_B) >> PAGE_SHIFT;
+		nr_stored = memcg_page_state(memcg, MEMCG_ZSWAPPED);
+	} else {
+		nr_backing = zswap_pool_total_size >> PAGE_SHIFT;
+		nr_stored = atomic_read(&zswap_nr_stored);
+	}
 
 	if (!nr_stored)
 		return 0;
--
2.44.0
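
For readers who want to see the shape of the fix outside the kernel tree, here is a rough user-space sketch of the runtime check that replaces the compile-time #ifdef. It is not kernel code. Everything in it (struct memcg_model, memcg_disabled, the pool_* counters, MODEL_PAGE_SHIFT, model_nr_stored()) is a hypothetical stand-in invented for illustration. It only models the point from the changelog: with memcg disabled at boot, the memcg pointer may be NULL, so the per-cgroup path must not be taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; holds only what the sketch needs. */
struct memcg_model {
	unsigned long zswap_bytes;	/* models the MEMCG_ZSWAP_B counter */
	unsigned long zswapped;		/* models the MEMCG_ZSWAPPED counter */
};

/* Hypothetical stand-ins for the pool-wide zswap counters. */
static unsigned long pool_total_size;
static unsigned long pool_nr_stored;

/* Models booting with cgroup_disable=memory (assumption: a plain flag). */
static bool memcg_disabled;

/* Assumed page shift for the model; 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12

/*
 * Return the number of stored pages and write the backing size, in pages,
 * to *nr_backing. The memcg pointer is dereferenced only when memcg is
 * enabled; otherwise the pool-wide counters are used and memcg may be NULL.
 */
static unsigned long model_nr_stored(const struct memcg_model *memcg,
				     unsigned long *nr_backing)
{
	if (!memcg_disabled) {
		/* In this model, an enabled memcg always supplies a valid pointer. */
		assert(memcg != NULL);
		*nr_backing = memcg->zswap_bytes >> MODEL_PAGE_SHIFT;
		return memcg->zswapped;
	}

	*nr_backing = pool_total_size >> MODEL_PAGE_SHIFT;
	return pool_nr_stored;
}
```

Calling model_nr_stored(NULL, &backing) with memcg_disabled set reads only the pool counters. The old #ifdef could not express this, because the decision there was made at build time rather than at boot.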