[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250226211628.GD3949421@google.com>
Date: Wed, 26 Feb 2025 21:16:28 +0000
From: Eric Biggers <ebiggers@...nel.org>
To: Yosry Ahmed <yosry.ahmed@...ux.dev>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Johannes Weiner <hannes@...xchg.org>, Nhat Pham <nphamcs@...il.com>,
Chengming Zhou <chengming.zhou@...ux.dev>,
"David S. Miller" <davem@...emloft.net>,
Herbert Xu <herbert@...dor.apana.org.au>, linux-mm@...ck.org,
linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org,
syzkaller-bugs@...glegroups.com,
syzbot+1a517ccfcbc6a7ab0f82@...kaller.appspotmail.com,
stable@...r.kernel.org
Subject: Re: [PATCH v2] mm: zswap: fix crypto_free_acomp() deadlock in
zswap_cpu_comp_dead()
On Wed, Feb 26, 2025 at 08:32:22PM +0000, Yosry Ahmed wrote:
> On Wed, Feb 26, 2025 at 08:00:16PM +0000, Eric Biggers wrote:
> > On Wed, Feb 26, 2025 at 06:56:25PM +0000, Yosry Ahmed wrote:
> > > Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
> > > the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
> > > (through crypto_exit_scomp_ops_async()).
> > >
> > > On the other hand, crypto_alloc_acomp_node() holds the scomp_lock
> > > (through crypto_scomp_init_tfm()), and then allocates memory.
> > > If the allocation results in reclaim, we may attempt to hold the per-CPU
> > > acomp_ctx mutex.
> >
> > The bug is in acomp. crypto_free_acomp() should never have to wait for a memory
> > allocation. That is what needs to be fixed.
>
> crypto_free_acomp() does not explicitly wait for an allocation, but it
> waits for scomp_lock (in crypto_exit_scomp_ops_async()), which may be
> held while allocating memory from crypto_scomp_init_tfm().
>
> Are you suggesting that crypto_exit_scomp_ops_async() should not be
> holding scomp_lock?
I think the solution while keeping the bounce buffer in place would be to do
what the patch
https://lore.kernel.org/linux-crypto/Z6w7Pz8jBeqhijut@gondor.apana.org.au/ does,
i.e. make the actual allocation and free happen outside the lock.
> > But really the bounce buffering in acomp (which is what is causing this problem)
> > should not exist at all. There is really no practical use case for it; it's
> > just there because of the Crypto API's insistence on shoehorning everything into
> > scatterlists for no reason...
>
> I am assuming this about scomp_scratch logic, which is what we need to
> hold the scomp_lock for, resulting in this problem.
Yes.
> If this is something that can be done right away I am fine with dropping
> this patch for an alternative fix, although it may be nice to reduce the
> lock critical section in zswap_cpu_comp_dead() to the bare minimum
> anyway.
Well, unfortunately the whole Crypto API philosophy of having a single interface
for software and for hardware offload doesn't really work. This is just yet
another example of that; it's a problem caused by shoehorning software
compression into an interface designed for hardware offload. zcomp really
should just use the compression libs directly (like most users of compression in
the kernel already do), and have an alternate code path specifically for
hardware offload (using acomp) for the few people who really want that.
- Eric
Powered by blists - more mailing lists