linux-kernel - Re: [PATCH v3 11/20] ratelimit: Force re-initialization when rate-limiting re-enabled

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58153904-1d2f-44aa-b97d-56486e91a787@paulmck-laptop>
Date: Mon, 28 Apr 2025 12:49:18 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Petr Mladek <pmladek@...e.com>
Cc: linux-kernel@...r.kernel.org, kernel-team@...a.com,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kuniyuki Iwashima <kuniyu@...zon.com>,
	Mateusz Guzik <mjguzik@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	John Ogness <john.ogness@...utronix.de>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	Jon Pan-Doh <pandoh@...gle.com>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	Karolina Stolarek <karolina.stolarek@...cle.com>
Subject: Re: [PATCH v3 11/20] ratelimit: Force re-initialization when
 rate-limiting re-enabled

On Mon, Apr 28, 2025 at 05:33:38PM +0200, Petr Mladek wrote:
> On Thu 2025-04-24 17:28:17, Paul E. McKenney wrote:
> > Currently, rate limiting being disabled results in an immediate early
> > return with no state changes.  This can result in false-positive drops
> > when re-enabling rate limiting.  Therefore, mark the ratelimit_state
> > structure "uninitialized" when rate limiting is disabled.
> > 
> > Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
> > Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
> > Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> > Cc: Petr Mladek <pmladek@...e.com>
> > Cc: Andrew Morton <akpm@...ux-foundation.org>
> > Cc: Kuniyuki Iwashima <kuniyu@...zon.com>
> > Cc: Mateusz Guzik <mjguzik@...il.com>
> > Cc: Steven Rostedt <rostedt@...dmis.org>
> > Cc: John Ogness <john.ogness@...utronix.de>
> > Cc: Sergey Senozhatsky <senozhatsky@...omium.org>
> > ---
> >  lib/ratelimit.c | 15 ++++++++++++++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/lib/ratelimit.c b/lib/ratelimit.c
> > index 7a7ba4835639f..52aab9229ca50 100644
> > --- a/lib/ratelimit.c
> > +++ b/lib/ratelimit.c
> > @@ -35,11 +35,24 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
> >  	unsigned long flags;
> >  	int ret;
> >  
> > +	/*
> > +	 * Zero interval says never limit, otherwise, non-positive burst
> > +	 * says always limit.
> > +	 */
> >  	if (interval <= 0 || burst <= 0) {
> >  		ret = interval == 0 || burst > 0;
> > +		if (!(READ_ONCE(rs->flags) & RATELIMIT_INITIALIZED) ||
> > +		    !raw_spin_trylock_irqsave(&rs->lock, flags)) {
> 
> I though more about this. And I am not sure if this is safe.
> 
> We are here when the structure has not been initialized yet.
> The spin lock is going to be likely still initialized.
> In theory, it might happen in parallel:
> 
> CPU0				CPU1
> 
> ___ratelimit()
>   if (interval <= 0 || burst <= 0)
>     // true on zero initialized struct
> 
>     if (!(READ_ONCE(rs->flags) & RATELIMIT_INITIALIZED) ||
>       // same here
>       !raw_spin_trylock_irqsave(&rs->lock, flags)) {
>       // success???
> 
> 				ratelimit_state_init()
> 				  raw_spin_lock_init(&rs->lock);
> 
> 
>       raw_spin_unlock_irqrestore(&rs->lock, flags);
> 
> BANG: Unlocked rs->lock which has been initialized in the meantime.
> 
> 
> Note: The non-initialized structure can be used in ext4 code,
>       definitely, see
>       https://lore.kernel.org/r/aAnp9rdPhRY52F7N@pathway.suse.cz
> 
>       Well, I think that it happens in the same CPU. So, it should
>       be "safe".
> 
> 
> Sigh, I guess that this patch is needed to fix the lib/tests/test_ratelimit.c.
> It works because the test modifies .burst and .interval directly.
> 
> What is a sane behavior/API?
> 
> IMHO:
> 
> 1. Nobody, including ext4 code, should use non-initialized
>    struct ratelimit_state. It is a ticking bomb.
> 
>    IMHO, it should trigger a warning and the callers should get fixed.

I agree in principle, but in practice it will at best take some time to
drive the change into the other systems.  So let's make the this code
work with the existing client code, get that accepted, and then separately
I can do another round of cleanup.  (Or you can, if you prefer.)

After that, the WARN_ONCE() can be upgraded to complain if both interval
and burst are zero.

> 2. Nobody, including the selftest, should modify struct ratelimit_state
>    directly.
> 
>    This patchset adds ratelimit_state_reset_interval() for this
>    purpose. It clears ~RATELIMIT_INITIALIZED. So this patch won't
>    be needed when the API was used in the selftest.
> 
> It was actually great that ___ratelimit() was able to handle
> non-initialized struct ratelimit_state without taking
> the bundled lock.
> 
> What do you think, please?

Hmmm...

It sounds like I need to avoid acquiring that lock if both interval
and burst are zero.  A bit inelegant, but it seems to be what needs
to happen.

In the short term, how about the patch shown below?

						Thanx, Paul

------------------------------------------------------------------------

diff --git a/lib/ratelimit.c b/lib/ratelimit.c
index 626f04cabb727..859c251b23ce2 100644
--- a/lib/ratelimit.c
+++ b/lib/ratelimit.c
@@ -42,7 +42,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
 	if (interval <= 0 || burst <= 0) {
 		WARN_ONCE(interval < 0 || burst < 0, "Negative interval (%d) or burst (%d): Uninitialized ratelimit_state structure?\n", interval, burst);
 		ret = interval == 0 || burst > 0;
-		if (!(READ_ONCE(rs->flags) & RATELIMIT_INITIALIZED) ||
+		if (!(READ_ONCE(rs->flags) & RATELIMIT_INITIALIZED) || (!interval && !burst) ||
 		    !raw_spin_trylock_irqsave(&rs->lock, flags))
 			goto nolock_ret;