[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAGegRW7qRpPLWJ7Q_j-J1fsKfxvv6+m_8N5q9QVZL6jgUz9cqw@mail.gmail.com>
Date: Wed, 18 Dec 2024 09:38:05 +0100
From: Alessandro Carminati <acarmina@...hat.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@....com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Clark Williams <clrkwllms@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-rt-devel@...ts.linux.dev,
Alessandro Carminati <alessandro.carminati@...il.com>,
Thomas Weissschuh <thomas.weissschuh@...utronix.de>, Juri Lelli <juri.lelli@...hat.com>,
Gabriele Paoloni <gpaoloni@...hat.com>, Eric Chanudet <echanude@...hat.com>, clement.leger@...tlin.com
Subject: Re: [PATCH v2] mm/kmemleak: Fix sleeping function called from invalid
context at print message
Hello Andrew,
Thanks for the feedback.
On Tue, Dec 17, 2024 at 6:55 PM Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> On Tue, 17 Dec 2024 14:20:33 +0000 Alessandro Carminati <acarmina@...hat.com> wrote:
>
> > Address a bug in the kernel that triggers a "sleeping function called from
> > invalid context" warning when /sys/kernel/debug/kmemleak is printed under
> > specific conditions:
> > - CONFIG_PREEMPT_RT=y
> > - Set SELinux as the LSM for the system
> > - Set kptr_restrict to 1
> > - kmemleak buffer contains at least one item
> >
> > BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
> > in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 136, name: cat
>
> -rt is a bit annoying this way. Things which we expect to work OK are
> no longer doing so.
>
> > preempt_count: 1, expected: 0
> > RCU nest depth: 2, expected: 2
> > 6 locks held by cat/136:
> > #0: ffff32e64bcbf950 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0xb8/0xe30
> > #1: ffffafe6aaa9dea0 (scan_mutex){+.+.}-{3:3}, at: kmemleak_seq_start+0x34/0x128
> > #3: ffff32e6546b1cd0 (&object->lock){....}-{2:2}, at: kmemleak_seq_show+0x3c/0x1e0
> > #4: ffffafe6aa8d8560 (rcu_read_lock){....}-{1:2}, at: has_ns_capability_noaudit+0x8/0x1b0
> > #5: ffffafe6aabbc0f8 (notif_lock){+.+.}-{2:2}, at: avc_compute_av+0xc4/0x3d0
> > irq event stamp: 136660
> > hardirqs last enabled at (136659): [<ffffafe6a80fd7a0>] _raw_spin_unlock_irqrestore+0xa8/0xd8
> > hardirqs last disabled at (136660): [<ffffafe6a80fd85c>] _raw_spin_lock_irqsave+0x8c/0xb0
> > softirqs last enabled at (0): [<ffffafe6a5d50b28>] copy_process+0x11d8/0x3df8
> > softirqs last disabled at (0): [<0000000000000000>] 0x0
> > Preemption disabled at:
> > [<ffffafe6a6598a4c>] kmemleak_seq_show+0x3c/0x1e0
> > CPU: 1 UID: 0 PID: 136 Comm: cat Tainted: G E 6.11.0-rt7+ #34
> > Tainted: [E]=UNSIGNED_MODULE
> > Hardware name: linux,dummy-virt (DT)
> > Call trace:
> > dump_backtrace+0xa0/0x128
> > show_stack+0x1c/0x30
> > dump_stack_lvl+0xe8/0x198
> > dump_stack+0x18/0x20
> > rt_spin_lock+0x8c/0x1a8
> > avc_perm_nonode+0xa0/0x150
> > cred_has_capability.isra.0+0x118/0x218
> > selinux_capable+0x50/0x80
> > security_capable+0x7c/0xd0
> > has_ns_capability_noaudit+0x94/0x1b0
> > has_capability_noaudit+0x20/0x30
> > restricted_pointer+0x21c/0x4b0
> > pointer+0x298/0x760
> > vsnprintf+0x330/0xf70
> > seq_printf+0x178/0x218
> > print_unreferenced+0x1a4/0x2d0
> > kmemleak_seq_show+0xd0/0x1e0
> > seq_read_iter+0x354/0xe30
> > seq_read+0x250/0x378
> > full_proxy_read+0xd8/0x148
> > vfs_read+0x190/0x918
> > ksys_read+0xf0/0x1e0
> > __arm64_sys_read+0x70/0xa8
> > invoke_syscall.constprop.0+0xd4/0x1d8
> > el0_svc+0x50/0x158
> > el0t_64_sync+0x17c/0x180
> >
> > %pS and %pK, in the same back trace line, are redundant, and %pS can void
> > %pK service in certain contexts.
> >
> > %pS alone already provides the necessary information, and if it cannot
> > resolve the symbol, it falls back to printing the raw address voiding
> > the original intent behind the %pK.
> >
> > Additionally, %pK requires a privilege check CAP_SYSLOG enforced through
> > the LSM, which can trigger a "sleeping function called from invalid
> > context" warning under RT_PREEMPT kernels when the check occurs in an
> > atomic context. This issue may also affect other LSMs.
> >
> > This change avoids the unnecessary privilege check and resolves the
> > sleeping function warning without any loss of information.
> >
> > Signed-off-by: Alessandro Carminati <acarmina@...hat.com>
>
> I'm adding
>
> Fixes: 3a6f33d86baa ("mm/kmemleak: use %pK to display kernel pointers in backtrace")
> Cc: <stable@...r.kernel.org>
>
> > --- a/mm/kmemleak.c
> > +++ b/mm/kmemleak.c
> > @@ -373,7 +373,7 @@ static void print_unreferenced(struct seq_file *seq,
> >
> > for (i = 0; i < nr_entries; i++) {
> > void *ptr = (void *)entries[i];
> > - warn_or_seq_printf(seq, " [<%pK>] %pS\n", ptr, ptr);
> > + warn_or_seq_printf(seq, " %pS\n", ptr);
> > }
> > }
>
> Before 3a6f33d86baa we were still printing the address, with plain old
> %p. Should we restore that, or is %p always useless?
>
The intent of 3a6f33d86baa8 is to make pointers available for debugging
purposes.
My argument for this change is that %pS provides already sufficient
information for debugging purposes.
We essentially have two scenarios concerning %pS:
1. When the symbol can be resolved by kallsyms: In this case, a
symbol+offset style pointer is emitted, which I find more useful than
just the address.
2. When the symbol cannot be resolved by kallsyms: Here, the raw pointer
is emitted. This undermines the original intent of %pK, as both the
masked address and the plain address are exposed.
If we want to be picky, there's an additional factor to consider: KASLR.
%pS emits its output accounting for randomization, but in most cases, the
randomized offset information is lost.
The exception is when the symbol cannot be resolved by kallsyms, though
this should be rare. It's still worth mentioning.
Back to your question: %p emits an hashed address that is not
reversible by design.
Therefore, in my opinion, it is not particularly useful in a debug
message.
--
---
172
Powered by blists - more mailing lists