[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <812029378ad6adb5121ef336505d4627d70c1c69.camel@redhat.com>
Date: Wed, 15 Oct 2025 16:31:06 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: "Adrian Huang (Lenovo)" <adrianhuang0701@...il.com>, Steven Rostedt
<rostedt@...dmis.org>
Cc: Masami Hiramatsu <mhiramat@...nel.org>, athieu Desnoyers
<mathieu.desnoyers@...icios.com>, Nam Cao <namcao@...utronix.de>,
linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org,
ahuang12@...ovo.com
Subject: Re: [PATCH 1/1] rv: Fix the invalid address access when reading
enabled_monitors
On Wed, 2025-10-15 at 22:14 +0800, Adrian Huang (Lenovo) wrote:
> When executing the following command (reproducible 100%), a kernel oops
> occurs:
>
Thanks Adrian for the patch, I believe your fix is equivalent to Nam's [1],
we'll try to get [1] merged for -rc2.
Thanks,
Gabriele
[1] - https://lore.kernel.org/lkml/20251002082235.973099-1-namcao@linutronix.de
> # cat /sys/kernel/debug/tracing/rv/enabled_monitors
>
> [dmesg]
> BUG: unable to handle page fault for address: ffffffffffffffd0
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 3ddf828067 P4D 3ddf829067 PUD 3ddf82b067 PMD 0
> Oops: Oops: 0000 [#1] SMP NOPTI
> CPU: 479 UID: 0 PID: 15237 Comm: cat Kdump: loaded Not tainted 6.18.0-rc1
> #25 PREEMPT(voluntary)
> Hardware name: Lenovo ThinkSystem SR645 V3 MB,Genoa,DDR5,Oahu,1U/SB27B31174,
> BIOS KAE139B-5.70 06/11/2025
> RIP: 0010:enabled_monitors_next+0x41/0x60
> Code: c0 66 a6 bb 75 24 eb 2d 66 66 2e 0f 1f 84 00 00 00 00 00 66 0f 1f 44
> 00 00 48 8b 50 40 48 8d 42 c0 48 81 fa c0 66 a6 bb 74 0b <80> 78 10 00 74 e9
> c3 cc cc cc cc 31 c0 c3 cc cc cc cc 66 66 2e 0f
> RSP: 0018:ff565a8b1e653d38 EFLAGS: 00010203
> RAX: ffffffffffffffc0 RBX: ff41e057260eabb8 RCX: 0000000000000001
> RDX: 0000000000000000 RSI: ffffffffbba66640 RDI: ff41e057260eabb8
> RBP: 0000000000000000 R08: ffffffffbb313406 R09: 0000000000000034
> R10: 0000000000000000 R11: 0000000000000004 R12: ff565a8b1e653dd0
> R13: ff565a8b1e653da8 R14: ffffffffbba66640 R15: 0000000000000000
> FS: 00007f8da78a4740(0000) GS:ff41e076cde6e000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd0 CR3: 00000001b6b26005 CR4: 0000000000f71ef0
> PKRU: 55555554
> Call Trace:
> <TASK>
> seq_read_iter+0x2ed/0x480
> seq_read+0x12e/0x160
> vfs_read+0xc1/0x370
> ? count_memcg_events+0xb6/0x170
> ksys_read+0x6b/0xe0
> do_syscall_64+0x89/0x810
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> RIP: 0033:0x7f8da79a0321
>
> [Root Cause]
> enabled_monitors_start() calls enabled_monitors_next() and passes the
> address of a struct rv_monitor instead of the address of its embedded
> list_head.
>
> Commit de090d1ccae1 ("rv: Fix wrong type cast in enabled_monitors_next()")
> assumes that the argument p refers to a list_head (the embedded list
> anchor). This leads to the miscalculated address for the corresponding
> struct rv_monitor object.
>
> Here’s what happens in detail:
>
> 1. Address of rv_monitors_list = 0xffffffffbba666c0
> crash> p &rv_monitors_list
> $1 = (struct list_head *) 0xffffffffbba666c0 <rv_monitors_list>
>
> 2. offset of rv_monitor.list = 0x40
> crash> rv_monitor.list -x
> struct rv_monitor {
> [0x40] struct list_head list;
> }
>
> 3. In enabled_monitors_start(): The local variable mon is assigned
> using list_entry(), resulting in:
> mon = &rv_monitors_list - 0x40 = 0xffffffffbba66680
>
> 4. In enabled_monitors_next(): The argument p becomes
> &rv_monitors_list - 0x40. However, container_of() subtracts the
> offset of rv_monitor.list again, making mon equal to
> &rv_monitors_list - 0x80. This double subtraction results in an
> invalid address and triggers the page fault.
>
> Fix the issue by returning the address of the list_head from both
> enabled_monitors_start() and enabled_monitors_next() instead of the
> address of struct rv_monitor.
>
> The following verifications make sure the issue is fixed:
> 1. Without enabling any monitors
> # cat /sys/kernel/debug/tracing/rv/enabled_monitors
> <No output; no kernel oops.>
>
> 2. Enable monitor and reading enabled_monitors: Test #1
> # echo 1 > /sys/kernel/debug/tracing/rv/monitors/wwnr/enable
> # echo 1 > /sys/kernel/debug/tracing/rv/monitors/rtapp/sleep/enable
> # cat /sys/kernel/debug/tracing/rv/enabled_monitors
> wwnr
> rtapp:sleep
>
> 3. Enable monitor and reading enabled_monitors: Test #2
> # echo 0 > /sys/kernel/debug/tracing/rv/monitors/wwnr/enable
> # echo 1 > /sys/kernel/debug/tracing/rv/monitors/rtapp/sleep/enable
> # echo 1 > /sys/kernel/debug/tracing/rv/monitors/rtapp/pagefault/enable
> # echo 1 > /sys/kernel/debug/tracing/rv/monitors/rtapp/enable
> # cat /sys/kernel/debug/tracing/rv/enabled_monitors
> rtapp
> rtapp:sleep
> rtapp:pagefault
>
> Fixes: de090d1ccae1 ("rv: Fix wrong type cast in enabled_monitors_next()")
> Signed-off-by: Adrian Huang (Lenovo) <adrianhuang0701@...il.com>
> ---
> kernel/trace/rv/rv.c | 14 ++++++--------
> 1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/trace/rv/rv.c b/kernel/trace/rv/rv.c
> index 48338520376f..021cc9bc57ab 100644
> --- a/kernel/trace/rv/rv.c
> +++ b/kernel/trace/rv/rv.c
> @@ -501,7 +501,7 @@ static void *enabled_monitors_next(struct seq_file *m,
> void *p, loff_t *pos)
>
> list_for_each_entry_continue(mon, &rv_monitors_list, list) {
> if (mon->enabled)
> - return mon;
> + return &mon->list;
> }
>
> return NULL;
> @@ -509,23 +509,21 @@ static void *enabled_monitors_next(struct seq_file *m,
> void *p, loff_t *pos)
>
> static void *enabled_monitors_start(struct seq_file *m, loff_t *pos)
> {
> - struct rv_monitor *mon;
> + struct list_head *lh = &rv_monitors_list;
> loff_t l;
>
> mutex_lock(&rv_interface_lock);
>
> - if (list_empty(&rv_monitors_list))
> + if (list_empty(lh))
> return NULL;
>
> - mon = list_entry(&rv_monitors_list, struct rv_monitor, list);
> -
> for (l = 0; l <= *pos; ) {
> - mon = enabled_monitors_next(m, mon, &l);
> - if (!mon)
> + lh = enabled_monitors_next(m, lh, &l);
> + if (!lh)
> break;
> }
>
> - return mon;
> + return lh;
> }
>
> /*
Powered by blists - more mailing lists