linux-kernel - RE: [PATCH v1] mm: zswap: Fix a potential memory leak in zswap


Message-ID: <SJ0PR11MB5678C24CDF6AA4FED306FC71C95A2@SJ0PR11MB5678.namprd11.prod.outlook.com>
Date: Wed, 13 Nov 2024 22:13:32 +0000
From: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
To: Johannes Weiner <hannes@...xchg.org>
CC: Yosry Ahmed <yosryahmed@...gle.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>,
	"nphamcs@...il.com" <nphamcs@...il.com>, "chengming.zhou@...ux.dev"
	<chengming.zhou@...ux.dev>, "usamaarif642@...il.com"
	<usamaarif642@...il.com>, "ryan.roberts@....com" <ryan.roberts@....com>,
	"Huang, Ying" <ying.huang@...el.com>, "21cnbao@...il.com"
	<21cnbao@...il.com>, "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"Feghali, Wajdi K" <wajdi.k.feghali@...el.com>, "Gopal, Vinodh"
	<vinodh.gopal@...el.com>, "Sridhar, Kanchana P"
	<kanchana.p.sridhar@...el.com>
Subject: RE: [PATCH v1] mm: zswap: Fix a potential memory leak in
 zswap_decompress().


> -----Original Message-----
> From: Johannes Weiner <hannes@...xchg.org>
> Sent: Wednesday, November 13, 2024 1:30 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> Cc: Yosry Ahmed <yosryahmed@...gle.com>; linux-kernel@...r.kernel.org;
> linux-mm@...ck.org; nphamcs@...il.com; chengming.zhou@...ux.dev;
> usamaarif642@...il.com; ryan.roberts@....com; Huang, Ying
> <ying.huang@...el.com>; 21cnbao@...il.com; akpm@...ux-foundation.org;
> Feghali, Wajdi K <wajdi.k.feghali@...el.com>; Gopal, Vinodh
> <vinodh.gopal@...el.com>
> Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in
> zswap_decompress().
> 
> On Wed, Nov 13, 2024 at 07:12:18PM +0000, Sridhar, Kanchana P wrote:
> > I am still thinking moving the mutex_unlock() could help, or at least have
> > no downside. The acomp_ctx is per-cpu and its mutex_lock/unlock
> > safeguards the interaction between the decompress operation, the
> > sg_*() API calls inside zswap_decompress() and the shared zpool.
> >
> > If we release the per-cpu acomp_ctx's mutex lock before the
> > zpool_unmap_handle(), is it possible that another cpu could acquire
> > its acomp_ctx's lock and map the same zpool handle (that the earlier
> > cpu has yet to unmap or is concurrently unmapping) for a write?
> > If this could happen, would it result in undefined state for both
> > these zpool ops on different cpus?
> 
> The code is fine as is.
> 
> Like you said, acomp_ctx->buffer (the pointer) doesn't change. It
> points to whatever was kmalloced in zswap_cpu_comp_prepare(). The
> handle points to backend memory. Neither of those addresses can change
> under us. There is no confusing them, and they cannot coincide.
> 
> The mutex guards the *memory* behind the buffer, so that we don't have
> multiple (de)compressors stepping on each other's toes. But it's fine
> to drop the mutex once we're done working with the memory. We don't
> need the mutex to check whether src holds the acomp buffer address.

Thanks, Johannes, for these insights. I had been thinking of the following
in zswap_decompress() as creating a non-preemptible context, because
of the call to raw_cpu_ptr() at the start, with that context extending
until the mutex_unlock():

	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
	mutex_lock(&acomp_ctx->mutex);

	[...]

	mutex_unlock(&acomp_ctx->mutex);

	if (src != acomp_ctx->buffer)
		zpool_unmap_handle(zpool, entry->handle);

Based on this understanding, I was a bit worried that the
"acomp_ctx->buffer" in the conditional that gates the
zpool_unmap_handle() might not belong to the same acomp_ctx as the one
at the beginning. I may have been confusing myself, since the acomp_ctx
is not re-evaluated before the conditional; it is simply reused from the
start. My apologies to you and Yosry!
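
To make the point concrete, here is a minimal user-space sketch, not
kernel code. Everything named model_* is a hypothetical stand-in
assumed for illustration only: a pthread mutex plays the role of
acomp_ctx->mutex, a plain integer plays the role of "the CPU we are
running on", and migration is simulated by changing that integer while
the lock is held. It only shows that a context pointer captured once is
unaffected by a later change of CPU:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS	2
#define MODEL_BUF_SIZE	64

/* Hypothetical stand-in for the per-cpu acomp_ctx. */
struct model_ctx {
	pthread_mutex_t mutex;
	unsigned char buffer[MODEL_BUF_SIZE];
};

static struct model_ctx model_ctxs[MODEL_NR_CPUS] = {
	{ .mutex = PTHREAD_MUTEX_INITIALIZER },
	{ .mutex = PTHREAD_MUTEX_INITIALIZER },
};

/* Stand-in for the CPU the task is currently running on. */
static int model_cur_cpu;

/* Models raw_cpu_ptr(): the per-cpu slot is looked up once, here. */
static struct model_ctx *model_raw_cpu_ptr(void)
{
	return &model_ctxs[model_cur_cpu];
}

/*
 * Returns true if the check after the unlock would unmap the handle,
 * i.e. if src does not point at the context's own buffer.
 */
static bool model_decompress(bool use_ctx_buffer, unsigned char *mapped)
{
	struct model_ctx *ctx = model_raw_cpu_ptr();
	unsigned char *src;

	assert(mapped != NULL);

	pthread_mutex_lock(&ctx->mutex);
	src = use_ctx_buffer ? ctx->buffer : mapped;

	/* Simulated migration: the "current CPU" changes under us. */
	model_cur_cpu = (model_cur_cpu + 1) % MODEL_NR_CPUS;
	pthread_mutex_unlock(&ctx->mutex);

	/* ctx is the pointer captured above; it is not looked up again. */
	return src != ctx->buffer;
}
```

In this model the comparison after the unlock gives the same answer
whether or not the simulated migration happens, because both sides of
the comparison were fixed before the lock was dropped.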

> 
> That being said, I do think there is a UAF bug in CPU hotplugging.
> 
> There is an acomp_ctx for each cpu, but note that this is best effort
> parallelism, not a guarantee that we always have the context of the
> local CPU. Look closely: we pick the "local" CPU with preemption
> enabled, then contend for the mutex. This may well put us to sleep and
> get us migrated, so we could be using the context of a CPU we are no
> longer running on. This is fine because we hold the mutex - if that
> other CPU tries to use the acomp_ctx, it'll wait for us.
> 
> However, if we get migrated and vacate the CPU whose context we have
> locked, the CPU might get offlined and zswap_cpu_comp_dead() can free
> the context underneath us. I think we need to refcount the acomp_ctx.

I see. Wouldn't it then make the code more fail-safe to not allow the
migration to happen until after the check for (src != acomp_ctx->buffer), by
moving the mutex_unlock() after this check? Or, to use a boolean to determine
whether the unmap_handle needs to be done, as Yosry suggested?
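
For reference, the boolean variant could look roughly like the
user-space sketch below. This is only my reading of the idea, not
Yosry's actual proposal and not kernel code: every sketch_* name is
hypothetical, a pthread mutex stands in for acomp_ctx->mutex, a memcpy()
stands in for the decompression, and a counter stands in for the zpool
mapping state so that a missed or doubled unmap is visible.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_BUF_SIZE	64

/* Hypothetical stand-in for the per-cpu acomp_ctx. */
struct sketch_ctx {
	pthread_mutex_t mutex;
	unsigned char buffer[SKETCH_BUF_SIZE];
};

/* Hypothetical stand-in for a zpool handle and its mapping state. */
struct sketch_handle {
	unsigned char data[SKETCH_BUF_SIZE];
	int map_count;	/* outstanding mappings; must return to 0 */
};

static unsigned char *sketch_map(struct sketch_handle *h)
{
	h->map_count++;
	return h->data;
}

static void sketch_unmap(struct sketch_handle *h)
{
	assert(h->map_count > 0);	/* catches a double unmap */
	h->map_count--;
}

static void sketch_decompress(struct sketch_ctx *ctx, struct sketch_handle *h,
			      bool copy_first, unsigned char *dst)
{
	bool still_mapped = true;
	unsigned char *src;

	pthread_mutex_lock(&ctx->mutex);
	src = sketch_map(h);
	if (copy_first) {
		/* Copy into the context buffer and unmap right away. */
		memcpy(ctx->buffer, src, SKETCH_BUF_SIZE);
		src = ctx->buffer;
		sketch_unmap(h);
		still_mapped = false;
	}
	memcpy(dst, src, SKETCH_BUF_SIZE);	/* stands in for decompression */
	pthread_mutex_unlock(&ctx->mutex);

	/* The decision uses only a local flag; ctx is not touched here. */
	if (still_mapped)
		sketch_unmap(h);
}
```

With the flag, the code after the unlock no longer dereferences the
context at all, so the unmap decision does not depend on which context
was locked or on whether it is still around.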

Thanks,
Kanchana