linux-kernel - Re: [PATCH] signal: restore the override

Message-ID: <ZyVpXtpAn1YKtXQS@google.com>
Date: Fri, 1 Nov 2024 23:50:54 +0000
From: Roman Gushchin <roman.gushchin@...ux.dev>
To: Alexey Gladkov <legion@...nel.org>
Cc: linux-kernel@...r.kernel.org, Andrei Vagin <avagin@...gle.com>,
	Kees Cook <kees@...nel.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>, stable@...r.kernel.org
Subject: Re: [PATCH] signal: restore the override_rlimit logic

On Sat, Nov 02, 2024 at 12:28:38AM +0100, Alexey Gladkov wrote:
> On Thu, Oct 31, 2024 at 08:04:38PM +0000, Roman Gushchin wrote:
> > Prior to commit d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of
> > ucounts") UCOUNT_RLIMIT_SIGPENDING rlimit was not enforced for a class
> > of signals. However now it's enforced unconditionally, even if
> > override_rlimit is set. This behavior change caused production issues.
> > 
> > For example, if the limit is reached and a process receives a SIGSEGV
> > signal, sigqueue_alloc fails to allocate the necessary resources for the
> > signal delivery, preventing the signal from being delivered with
> > siginfo. This prevents the process from correctly identifying the fault
> > address and handling the error. From the user-space perspective,
> > applications are unaware that the limit has been reached and that the
> > siginfo is effectively 'corrupted'. This can lead to unpredictable
> > behavior and crashes, as we observed with java applications.
> > 
> > Fix this by passing override_rlimit into inc_rlimit_get_ucounts() and
> > skip the comparison to max there if override_rlimit is set. This
> > effectively restores the old behavior.
> > 
> > Fixes: d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
> > Signed-off-by: Roman Gushchin <roman.gushchin@...ux.dev>
> > Co-developed-by: Andrei Vagin <avagin@...gle.com>
> > Signed-off-by: Andrei Vagin <avagin@...gle.com>
> > Cc: Kees Cook <kees@...nel.org>
> > Cc: "Eric W. Biederman" <ebiederm@...ssion.com>
> > Cc: Alexey Gladkov <legion@...nel.org>
> > Cc: <stable@...r.kernel.org>
> > ---
> >  include/linux/user_namespace.h | 3 ++-
> >  kernel/signal.c                | 3 ++-
> >  kernel/ucount.c                | 5 +++--
> >  3 files changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> > index 3625096d5f85..7183e5aca282 100644
> > --- a/include/linux/user_namespace.h
> > +++ b/include/linux/user_namespace.h
> > @@ -141,7 +141,8 @@ static inline long get_rlimit_value(struct ucounts *ucounts, enum rlimit_type ty
> >  
> >  long inc_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v);
> >  bool dec_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v);
> > -long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type);
> > +long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type,
> > +			    bool override_rlimit);
> >  void dec_rlimit_put_ucounts(struct ucounts *ucounts, enum rlimit_type type);
> >  bool is_rlimit_overlimit(struct ucounts *ucounts, enum rlimit_type type, unsigned long max);
> >  
> > diff --git a/kernel/signal.c b/kernel/signal.c
> > index 4344860ffcac..cbabb2d05e0a 100644
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -419,7 +419,8 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
> >  	 */
> >  	rcu_read_lock();
> >  	ucounts = task_ucounts(t);
> > -	sigpending = inc_rlimit_get_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING);
> > +	sigpending = inc_rlimit_get_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING,
> > +					    override_rlimit);
> >  	rcu_read_unlock();
> >  	if (!sigpending)
> >  		return NULL;
> > diff --git a/kernel/ucount.c b/kernel/ucount.c
> > index 16c0ea1cb432..046b3d57ebb4 100644
> > --- a/kernel/ucount.c
> > +++ b/kernel/ucount.c
> > @@ -307,7 +307,8 @@ void dec_rlimit_put_ucounts(struct ucounts *ucounts, enum rlimit_type type)
> >  	do_dec_rlimit_put_ucounts(ucounts, NULL, type);
> >  }
> >  
> > -long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type)
> > +long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type,
> > +			    bool override_rlimit)
> >  {
> >  	/* Caller must hold a reference to ucounts */
> >  	struct ucounts *iter;
> > @@ -316,7 +317,7 @@ long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type)
> >  
> >  	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
> >  		long new = atomic_long_add_return(1, &iter->rlimit[type]);
> > -		if (new < 0 || new > max)
> > +		if (new < 0 || (!override_rlimit && (new > max)))
> >  			goto unwind;
> >  		if (iter == ucounts)
> >  			ret = new;
> 
> It's a bad patch. If we do as you suggest, it will
> do_dec_rlimit_put_ucounts() in case of overflow. This means you'll
> break the counter and there will be an extra decrement in __sigqueue_free().
> We can't just ignore the overflow here.

Hm, I don't think my patch changes anything in terms of overflow handling.
The (new < 0) check is exactly the same as it was; the only difference is
that (new > max) is allowed if override_rlimit is set. But new can never
be larger than LONG_MAX, so nothing actually changes when the limit
is LONG_MAX.
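
To illustrate, here is a minimal userspace sketch of just the per-level
condition before and after the patch. It is not the kernel code: the names
check_old(), check_new() and compare_at_long_max() are hypothetical, and the
atomic increment and the unwind path are deliberately left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Pre-patch condition: true means "goto unwind". */
bool check_old(long new_val, long max)
{
	return new_val < 0 || new_val > max;
}

/* Post-patch condition: the (new > max) test is skipped on override. */
bool check_new(long new_val, long max, bool override_rlimit)
{
	return new_val < 0 || (!override_rlimit && new_val > max);
}

/*
 * With max == LONG_MAX, (new > max) can never be true for a long,
 * so both conditions agree regardless of override_rlimit.
 */
void compare_at_long_max(void)
{
	const long samples[] = { LONG_MIN, -1, 0, 1, LONG_MAX };

	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		bool old_res = check_old(samples[i], LONG_MAX);

		assert(old_res == check_new(samples[i], LONG_MAX, true));
		assert(old_res == check_new(samples[i], LONG_MAX, false));
	}
}
```

In this model the two conditions differ only when override_rlimit is set and
new is above a finite max; a negative (overflowed) value is rejected in both.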

Maybe I'm missing something here; please clarify.

Thanks!