Message-ID: <b4c933e7-62e2-7018-d848-b5cde0d9ef26@prevas.dk>
Date: Thu, 13 Aug 2020 10:25:45 +0200
From: Rasmus Villemoes <rasmus.villemoes@...vas.dk>
To: Steven Rostedt <rostedt@...dmis.org>, linux-kernel@...r.kernel.org,
linux-rt-users <linux-rt-users@...r.kernel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Carsten Emde <C.Emde@...dl.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
John Kacur <jkacur@...hat.com>, Daniel Wagner <wagi@...om.org>,
Tom Zanussi <zanussi@...nel.org>,
"Srivatsa S. Bhat" <srivatsa@...il.mit.edu>,
Matt Fleming <matt@...eblueprint.co.uk>
Subject: Re: [PATCH RT 1/6] signal: Prevent double-free of user struct
On 13/08/2020 03.45, Steven Rostedt wrote:
> 5.4.54-rt33-rc1 stable review patch.
> If anyone has any objections, please let me know.
>
No objections, quite the contrary. I think this should also be applied
to 4.19-rt:
Commit fda31c50292a is also in 4.19.y (as 797479da0ae9), since 4.19.112
and hence also 4.19.112-rt47. For a while we've tried to track down a
hang that at least sometimes manifests quite similarly:
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 14 at lib/refcount.c:280 refcount_dec_not_one+0xc0/0xd8
...
Call Trace:
[cf45be10] [c0238258] refcount_dec_not_one+0xc0/0xd8 (unreliable)
[cf45be20] [c02383c8] refcount_dec_and_lock_irqsave+0x20/0xa4
[cf45be40] [c0024a70] free_uid+0x2c/0xa0
[cf45be60] [c00384f0] put_cred_rcu+0x58/0x8c
[cf45be70] [c005f048] rcu_cpu_kthread+0x364/0x49c
[cf45bee0] [c003a0d0] smpboot_thread_fn+0x21c/0x29c
[cf45bf10] [c0036464] kthread+0xe0/0x10c
[cf45bf40] [c000f1cc] ret_from_kernel_thread+0x14/0x1c
But our reproducer is rather complicated and involves cutting power to
neighbouring boards, and takes many minutes to trigger. So I tried
Daniel's reproducer
sigwaittest -t -a -p 98
and almost immediately got a trace much more similar to the one in the
commit message:
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 1526 at lib/refcount.c:280 refcount_dec_not_one+0xc0/0xd8
...
Call Trace:
[cebc9e00] [c0238258] refcount_dec_not_one+0xc0/0xd8 (unreliable)
[cebc9e10] [c02383c8] refcount_dec_and_lock_irqsave+0x20/0xa4
[cebc9e30] [c0024a70] free_uid+0x2c/0xa0
[cebc9e50] [c002574c] dequeue_signal+0x90/0x1a4
[cebc9e80] [c0028f74] sys_rt_sigtimedwait+0x24c/0x288
[cebc9f40] [c000f12c] ret_from_syscall+0x0/0x40
With this patch applied, the sigwaittest has now run for 10 minutes
without problems.
I'll have to run some more tests with our reproducer to see if it really
is the same issue, but even if not, the fact that the sigwaittest fails
should be enough to put this in 4.19-rt.
Three requests (in order of importance):
* pull this into 4.19-rt
* add a note about the sigwaittest reproducer to the commit log
* do publish the rt-rcs in some git repository; that makes it a lot
easier to cherry-pick and test patches
Thanks,
Rasmus