Message-ID: <aXlpzqh9K6bqBh4T@gondor.apana.org.au>
Date: Wed, 28 Jan 2026 09:43:42 +0800
From: Herbert Xu <herbert@...dor.apana.org.au>
To: Lianjie Wang <karin0.zst@...il.com>
Cc: Olivia Mackall <olivia@...enic.com>,
David Laight <david.laight.linux@...il.com>,
Jonathan McDowell <noodles@...a.com>, linux-crypto@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] hwrng: core - use RCU for current_rng to fix race
condition
On Tue, Jan 27, 2026 at 10:32:30PM +0900, Lianjie Wang wrote:
>
> It is true that cleanup_rng() can race with hwrng_init(). However,
> currently put_rng() is also called in hwrng_fillfn(), where we want to
> avoid holding rng_mutex to prevent deadlock in hwrng_unregister().
>
> To solve this race, should we introduce a separate lock (e.g.,
> init_mutex) to serialize only hwrng_init() and cleanup_rng()?
>
> Alternatively, I think we could also stop the thread in
> set_current_rng() before switching current_rng, so that each lifetime
> of the hwrng_fillfn() thread strictly holds a single RNG instance,
> avoiding the need to call get_current_rng() or put_rng() inside
> hwrng_fillfn().
Yes, I missed the deadlock.
I suggest that we defer the work in cleanup_rng into a work_struct,
IOW cleanup_rng will simply schedule a work_struct to perform the
actual cleanup.
The work_struct can then take the mutex safely, immediately check
the reference count on the rng, and if it is non-zero it should return
because the rng has been re-used in the meantime.
If it is zero then it can proceed with the clean-up while holding the
mutex, thus preventing any re-use from occurring.
> I removed the NULL check in put_rng() and moved it to rng_current_show()
> in v2, since all the other callers of put_rng() already check for NULL
> before calling put_rng(). I can restore the NULL check in put_rng() in
> v3 if preferred.
Let's keep this patch simple and leave things as they are. If
it turns out that moving the NULL check makes the code more readable,
you can do that in a follow-up patch.
> > > @@ -489,8 +502,17 @@ static int hwrng_fillfn(void *unused)
> > > struct hwrng *rng;
> > >
> > > rng = get_current_rng();
> > > - if (IS_ERR(rng) || !rng)
> > > + if (!rng) {
> > > + /* This is only possible within drop_current_rng(),
> > > + * so just wait until we are stopped.
> > > + */
> > > + while (!kthread_should_stop()) {
> > > + set_current_state(TASK_INTERRUPTIBLE);
> > > + schedule();
> > > + }
> > > break;
> > > + }
> > > +
> >
> > Is the schedule necessary? Shouldn't the break just work as it
> > did before?
>
> With the break alone, the task_struct might get freed before
> kthread_stop() is called, which can still occasionally cause a
> use-after-free:
Please change the comment above to say that this loop is needed
to keep the task_struct alive for kthread_stop(), as that isn't obvious.
I wonder why nobody has added a helper for this, since it seems to
be a common pattern among kthread users.  For example, fs/ext4/mmp.c
does the same thing, and it also sets the task state back to
TASK_RUNNING before returning; perhaps we should do that too?
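For reference, the shape I have in mind (untested sketch, just
mirroring the ext4 exit path):

	/* Keep the task_struct alive until kthread_stop() is called,
	 * otherwise the stopper could dereference a freed task.
	 */
	while (!kthread_should_stop()) {
		set_current_state(TASK_INTERRUPTIBLE);
		if (!kthread_should_stop())
			schedule();
	}
	set_current_state(TASK_RUNNING);
	return 0;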
Thanks,
--
Email: Herbert Xu <herbert@...dor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt