Message-ID: <6d353178-43f4-4b1b-b28b-f2e6c534a886@huaweicloud.com>
Date: Wed, 27 Aug 2025 19:58:55 +0800
From: Tengda Wu <wutengda@...weicloud.com>
To: Dave Hansen <dave.hansen@...el.com>,
 Alexander Potapenko <glider@...gle.com>
Cc: Andrey Ryabinin <ryabinin.a.a@...il.com>,
 Thomas Gleixner <tglx@...utronix.de>,
 Andrey Konovalov <andreyknvl@...il.com>,
 Borislav Petkov <bp@...en8.de>,
 Dave Hansen <dave.hansen@...ux.intel.com>,
 Dmitry Vyukov <dvyukov@...gle.com>,
 Ingo Molnar <mingo@...hat.com>,
 linux-kernel@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCH -next] x86: Prevent KASAN false positive warnings in __show_regs()
On 2025/8/21 11:13, Tengda Wu wrote:
>
>
> On 2025/8/21 5:36, Dave Hansen wrote:
>> On 8/18/25 06:07, Tengda Wu wrote:
>>> When process A accesses process B's `regs` from stack memory through
>>> __show_regs(), the stack of process B keeps changing during runtime.
>>> This causes false positives like "stack out-of-bounds" [1] or
>>> "out-of-bounds" [2] warnings when reading `regs` contents.
>>
>> Could you explain a little bit more how you know that these are false
>> positives?
>
> Thanks for the question. We believe this is a false positive caused by a
> race condition during asynchronous stack tracing of a running process:
>
> Process A (stack trace all processes)     Process B (running)
> 1. echo t > /proc/sysrq-trigger
>
>    show_trace_log_lvl
>      regs = unwind_get_entry_regs()
>      show_regs_if_on_stack(regs)
>                                           2. The stack data pointed to by
>                                              `regs` keeps changing, and
>                                              so do the markings in its
>                                              KASAN shadow region.
>        __show_regs(regs)
>          regs->ax, regs->bx, ...
>            3. hit KASAN redzones, OOB
>
> When process A dumps process B's stack without suspending it, the continuous
> changes in process B's stack (and the corresponding KASAN shadow markings)
> may cause process A to hit KASAN redzones when accessing stale `regs`
> addresses, resulting in false positive reports.
>
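To make the three steps above concrete, here is a minimal userspace sketch
in C. It is a toy model, not kernel code: `struct toy_stack`,
`toy_checked_read()`, `toy_task_b_runs()` and `toy_race_demo()` are
hypothetical names, and the shadow is reduced to one flag per 8-byte word
(the real KASAN shadow encoding is different). The race is replayed
sequentially so that the outcome is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_STACK_WORDS 16

/* Toy stand-in for process B's stack plus its shadow (assumption:
 * one boolean per word; true means "redzone, report on access"). */
struct toy_stack {
	uint64_t word[TOY_STACK_WORDS];
	bool poisoned[TOY_STACK_WORDS];
};

/* Instrumented read: consult the shadow, count a report if the word is
 * poisoned, then perform the load anyway. The load itself is harmless. */
static uint64_t toy_checked_read(const struct toy_stack *s, size_t idx,
				 int *reports)
{
	assert(idx < TOY_STACK_WORDS);
	if (s->poisoned[idx])
		(*reports)++;
	return s->word[idx];
}

/* Process B keeps running: the slot is reused by a new frame, which
 * changes both the data and the shadow marking for that slot. */
static void toy_task_b_runs(struct toy_stack *s, size_t idx)
{
	assert(idx < TOY_STACK_WORDS);
	s->word[idx] = 0xdeadbeefULL;
	s->poisoned[idx] = true;
}

/* Replays steps 1-3 and returns the number of reports raised. */
static int toy_race_demo(uint64_t *stale_value)
{
	struct toy_stack b;
	const size_t regs_idx = 4;	/* where A believes regs lives */
	int reports = 0;

	memset(&b, 0, sizeof(b));
	b.word[regs_idx] = 0x1234;

	/* Step 1: A locates regs; a read at this instant is clean. */
	(void)toy_checked_read(&b, regs_idx, &reports);

	/* Step 2: B runs and the slot's contents and shadow change. */
	toy_task_b_runs(&b, regs_idx);

	/* Step 3: A dereferences the now-stale location. */
	*stale_value = toy_checked_read(&b, regs_idx, &reports);
	return reports;
}
```

Calling `toy_race_demo()` from a `main()` yields exactly one report, raised
by the second read only. The value read is whatever B last wrote, which
mirrors the claim that the printed regs may be inaccurate while the access
itself is harmless.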
> A sample error log for this scenario is shown below:
>
> [332706.551830] task:cat state:R running task stack:0 pid:3983623 ppid:3977902 flags:0x00004002
> [332706.551847] Call Trace:
> [332706.551853] <TASK>
> [332706.551860] __schedule+0x809/0x1050
> [332706.551873] ? __pfx___schedule+0x10/0x10
> [332706.551885] ? __stack_depot_save+0x34/0x340
> [332706.551899] schedule+0x82/0x160
> [332706.551911] io_schedule+0x68/0xa0
> [332706.551923] __folio_lock_killable+0x1db/0x410
> [332706.551940] ? __pfx___folio_lock_killable+0x10/0x10
> [332706.551955] ? __pfx_wake_page_function+0x10/0x10
> [332706.551969] ? __filemap_get_folio+0x4b/0x3d0
> [332706.551982] filemap_fault+0x67a/0xbd0
> [332706.551996] ? __pfx_filemap_fault+0x10/0x10
> [332706.552008] ? policy_node+0x8a/0xa0
> [332706.552021] ? __mod_node_page_state+0x23/0xf0
> [332706.552035] __do_fault+0x6d/0x340
> [332706.552048] do_cow_fault+0xdd/0x300
> [332706.552061] do_fault+0x141/0x1e0
> [332706.552074] __handle_mm_fault+0x839/0xa70
> [332706.552089] ? __pfx___handle_mm_fault+0x10/0x10
> [332706.552105] ? find_vma+0x6a/0x90
> [332706.552117] handle_mm_fault+0x27d/0x470
> [332706.552132] exc_page_fault+0x336/0x6d0
> [332706.552145] asm_exc_page_fault+0x22/0x30
> [332706.552157] RIP: 0010:rep_stos_alternative+0x40/0x80 --- (1)
> [332706.552173] Code: Unable to access opcode bytes at 0x7ffe929c4fd6. --- (2)
> [332706.552181] RSP: 3fa6:ffff88ba5e554200 EFLAGS: ffffffff9d50c0e0 ORIG_RAX: ffff88ba5e554200 --- (3)
> [332706.552195] ==================================================================
> [332706.552324] BUG: KASAN: out-of-bounds in __show_regs+0x4b/0x340 --- (4)
> [332706.552433] Read of size 8 at addr ffff88d24999fb20 by task sysrq_t_test.sh/3977032
>
> We focus on logs (1) to (4):
> * Log (1) shows the current regs->ip (a kernel-mode address).
> * Log (2) displays regs->ip - PROLOGUE_SIZE, which deviates significantly
>   from kernel-mode addresses, indicating that regs->ip has changed.
> * Log (3) then reveals anomalous values in regs->{ss,sp,flags}.
> * Finally, log (4) reports a KASAN OOB error when accessing regs->bx.
>
> Stack tracing a running task cannot guarantee the accuracy of the printed
> regs values, but accessing regs addresses does not cause kernel instability.
> Similar cases have been consistently handled this way in the past. [1][2]
> We therefore consider this a KASAN false positive.
>
> [1] https://lore.kernel.org/all/20220915150417.722975-40-glider@google.com/T/#u
> [2] https://lore.kernel.org/all/5f6e80c4b0c7f7f0b6211900847a247cdaad753c.1479398226.git.jpoimboe@redhat.com/T/#u
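For readers less familiar with how such reads are usually exempted from
sanitizer checks: the kernel provides a `__no_sanitize_address` function
annotation and a `READ_ONCE_NOCHECK()` helper. The sketch below is only a
userspace analogue of that idea, assuming GCC or Clang. It is not the patch
under discussion, and `TOY_NO_SANITIZE_ADDRESS`, `struct toy_regs`,
`toy_read_nocheck()` and `toy_snapshot_regs()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC or Clang. The attribute asks the compiler not to emit
 * AddressSanitizer checks for the annotated function; noinline keeps the
 * load from being inlined into an instrumented caller. */
#if defined(__GNUC__) || defined(__clang__)
#define TOY_NO_SANITIZE_ADDRESS __attribute__((no_sanitize_address, noinline))
#else
#define TOY_NO_SANITIZE_ADDRESS
#endif

/* Hypothetical, heavily reduced stand-in for a saved register frame. */
struct toy_regs {
	uint64_t bx;
	uint64_t ax;
	uint64_t ip;
};

/* Single uninstrumented load; the volatile cast forces one real read. */
TOY_NO_SANITIZE_ADDRESS
static uint64_t toy_read_nocheck(const uint64_t *p)
{
	return *(const volatile uint64_t *)p;
}

/* Copy the frame field by field through the uninstrumented accessor, so
 * later printing works on a private snapshot, not on the live stack. */
static void toy_snapshot_regs(const struct toy_regs *src, struct toy_regs *dst)
{
	assert(src != NULL && dst != NULL);
	dst->bx = toy_read_nocheck(&src->bx);
	dst->ax = toy_read_nocheck(&src->ax);
	dst->ip = toy_read_nocheck(&src->ip);
}
```

Snapshotting first and printing afterwards confines the unchecked accesses
to one small helper. The values may still be stale or mutually
inconsistent if the traced task keeps running; the sketch only shows how
the sanitizer check is bypassed, not how consistency would be obtained.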
Gentle ping.
Any comments or suggestions are appreciated.
Thanks,
Tengda