Message-ID: <20220624184431.GA4386@zipoli.concurrent-rt.com>
Date: Fri, 24 Jun 2022 14:44:31 -0400
From: Joe Korty <joe.korty@...current-rt.com>
To: Mark Gross <markgross@...nel.org>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
Linux RT users <linux-rt-users@...r.kernel.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Thomas Gleixner <tglx@...utronix.de>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [RT BUG] Mismatched get_uid/free_uid usage in signals in some rts (2nd try)
[ Fixed incorrect linux-rt-users email address in CC ]
On Fri, Jun 24, 2022 at 09:58:07AM -0700, Mark Gross wrote:
> On Tue, Jun 21, 2022 at 03:16:39PM +0000, Joe Korty wrote:
> > Mismatched get_uid/free_uid usage in signals in 4.9.312-rt193
> >
> > [ First attempt using mutt did not show up on the mailing lists.
> > Trying again with office365 Outlook. Also added the 4.9-rt
> > maintainers. ]
> >
> > The 4.19-rt patch,
> >
> > 0329-signal-Prevent-double-free-of-user-struct.patch
> >
> > needs to be ported to LAG 4.9-rt, as that release now has the Linus commit,
> What does LAG stand for?
Hi Mark,
LAG = Latest and Greatest
> FWIW the cherry-pick within the RT-stable tree worked without conflict.
> (cherry picked from commit a99e09659e6cd4b633c3689f2c3aa5f8a816fe5b)
> It compiles.
> See 58a584ee59b2 signal: Prevent double-free of user struct in
> linux-stable-rt.git/v4.9-rt-next
>
> >
> > fda31c50292a ("signal: avoid double atomic counter increments for user accounting")
> >
> This was added to 4.9.y on March 20, 2020.
> commit 4306259ff6b8b682322d9aeb0c12b27c61c4a548 in linux-stable.
>
> How did you find this issue? What is missing from my testing?
>
> Do you have a test case with which I can confirm my cherry-pick works?
> Could you test the v4.9-rt-next branch to see if it fixes your issue?
We do not have a standard test. We were seeing crashes in NFS. It happened
only on arm64 systems. We have a custom kernel with changes and the test
consisted of exercising one of those changes, which involved lots of signals,
then running NFS tests in loopback mode. On occasion NFS would crash in
a way it never has crashed before, which suggested use-after-free corruption.
It never would crash unless we hit signals heavily first, which implied that
something in signals was wrong. After that it wasn't too hard to find the
patch that fixed the problem in 4.4, 4.14, 4.19, 5.4, and 5.10.
We have not seen the NFS crash since applying the fix.
Joe
PS: Correction to the table below. I tested a too-early version of 4.14-rt. Retested.
Current application status:

  4.4.302-rt232    OK       has both Linus's patch and the fix needed for rt.
  4.9.312-rt193    BROKE    has Linus's patch but not the fix.
- 4.14.87-rt50     OK       does NOT have either Linus's patch nor its rt fix.
+ 4.14.282-rt135   OK       has both Linus's patch and the fix needed for rt.
  4.19.246-rt110   OK       has both Linus's patch and the fix needed for rt.
  5.4.193-rt74     OK       has both Linus's patch and the fix needed for rt.
  5.10.120-rt70    OK       has both Linus's patch and the fix needed for rt.
  5.15.44-rt46     UNKNOWN  no get_uid/free_uid usage in kernel/signal.c anymore.
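
For readers unfamiliar with this class of bug, here is a minimal user-space
sketch of how a mismatched get/put pair can free an object too early. It is
not the kernel code. It assumes, for illustration only, that the allocation
path takes a reference just on the 0 -> 1 transition of a pending counter
while one free path still drops a reference on every call. All names
(user_model, queue_alloc, queue_free_matched, queue_free_mismatched,
run_model) are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for a refcounted per-user object.
 * This is NOT the kernel's struct user_struct. */
struct user_model {
    int refs;     /* reference count */
    int pending;  /* queued signals charged to this user */
    int freed;    /* set once refs drops to zero */
};

static void get_ref(struct user_model *u) { u->refs++; }

static void put_ref(struct user_model *u)
{
    if (--u->refs == 0)
        u->freed = 1;   /* a real object would be released here */
}

/* Allocation: take a reference only on the 0 -> 1 pending transition. */
static void queue_alloc(struct user_model *u)
{
    if (u->pending++ == 0)
        get_ref(u);
}

/* Matched free: drop the reference only on the 1 -> 0 transition. */
static void queue_free_matched(struct user_model *u)
{
    if (--u->pending == 0)
        put_ref(u);
}

/* Mismatched free: drops a reference on every call (the assumed bug). */
static void queue_free_mismatched(struct user_model *u)
{
    u->pending--;
    put_ref(u);
}

/* Queue n signals, then free them with free_fn. Returns the number of
 * frees after which the object was released, or 0 if it survived. */
static int run_model(int n, void (*free_fn)(struct user_model *))
{
    /* refs starts at 1: someone else still holds the object. */
    struct user_model u = { .refs = 1, .pending = 0, .freed = 0 };
    int i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        queue_alloc(&u);
    for (i = 0; i < n; i++) {
        free_fn(&u);
        if (u.freed)
            return i + 1;
    }
    return 0;
}
```

In this model, run_model(3, queue_free_matched) leaves the object alive,
while run_model(3, queue_free_mismatched) releases it on the second free,
with one signal still pending and the original holder still expecting a
valid object. With only one queued signal the mismatch stays hidden, which
is consistent with needing heavy signal traffic before anything goes wrong.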