Message-ID: <6d353178-43f4-4b1b-b28b-f2e6c534a886@huaweicloud.com>
Date: Wed, 27 Aug 2025 19:58:55 +0800
From: Tengda Wu <wutengda@...weicloud.com>
To: Dave Hansen <dave.hansen@...el.com>,
 Alexander Potapenko <glider@...gle.com>
Cc: Andrey Ryabinin <ryabinin.a.a@...il.com>,
 Thomas Gleixner <tglx@...utronix.de>, Andrey Konovalov
 <andreyknvl@...il.com>, Borislav Petkov <bp@...en8.de>,
 Dave Hansen <dave.hansen@...ux.intel.com>, Dmitry Vyukov
 <dvyukov@...gle.com>, Ingo Molnar <mingo@...hat.com>,
 linux-kernel@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCH -next] x86: Prevent KASAN false positive warnings in
 __show_regs()


On 2025/8/21 11:13, Tengda Wu wrote:
> 
> 
> On 2025/8/21 5:36, Dave Hansen wrote:
>> On 8/18/25 06:07, Tengda Wu wrote:
>>> When process A accesses process B's `regs` from stack memory through
>>> __show_regs(), the stack of process B keeps changing during runtime.
>>> This causes false positives like "stack out-of-bounds" [1] or
>>> "out-of-bounds" [2] warnings when reading `regs` contents.
>>
>> Could you explain a little bit more how you know that these are false
>> positives?
> 
> Thanks for the question. We believe this is a false positive caused by a
> race condition during asynchronous stack tracing of a running process:
> 
> Process A (stack trace all processes)         Process B (running)
> 1. echo t > /proc/sysrq-trigger
> 
> show_trace_log_lvl
>   regs = unwind_get_entry_regs()
>   show_regs_if_on_stack(regs)
>                                               2. The stack data pointed to by
>                                                  `regs` keeps changing, and
>                                                  so do the markings in its
>                                                  KASAN shadow region.
>     __show_regs(regs)
>       regs->ax, regs->bx, ...
>         3. hit KASAN redzones, OOB
> 
> When process A stack-traces process B without suspending it, the continuous
> changes in process B's stack (and the corresponding KASAN shadow markings)
> may cause process A to hit KASAN redzones when accessing stale `regs`
> addresses, resulting in false positive reports.
> 
> A sample error log for this scenario is shown below:
> 
> [332706.551830] task:cat             state:R  running task     stack:0     pid:3983623 ppid:3977902 flags:0x00004002
> [332706.551847] Call Trace:
> [332706.551853]  <TASK>
> [332706.551860]  __schedule+0x809/0x1050
> [332706.551873]  ? __pfx___schedule+0x10/0x10
> [332706.551885]  ? __stack_depot_save+0x34/0x340
> [332706.551899]  schedule+0x82/0x160
> [332706.551911]  io_schedule+0x68/0xa0
> [332706.551923]  __folio_lock_killable+0x1db/0x410
> [332706.551940]  ? __pfx___folio_lock_killable+0x10/0x10
> [332706.551955]  ? __pfx_wake_page_function+0x10/0x10
> [332706.551969]  ? __filemap_get_folio+0x4b/0x3d0
> [332706.551982]  filemap_fault+0x67a/0xbd0
> [332706.551996]  ? __pfx_filemap_fault+0x10/0x10
> [332706.552008]  ? policy_node+0x8a/0xa0
> [332706.552021]  ? __mod_node_page_state+0x23/0xf0
> [332706.552035]  __do_fault+0x6d/0x340
> [332706.552048]  do_cow_fault+0xdd/0x300
> [332706.552061]  do_fault+0x141/0x1e0
> [332706.552074]  __handle_mm_fault+0x839/0xa70
> [332706.552089]  ? __pfx___handle_mm_fault+0x10/0x10
> [332706.552105]  ? find_vma+0x6a/0x90
> [332706.552117]  handle_mm_fault+0x27d/0x470
> [332706.552132]  exc_page_fault+0x336/0x6d0
> [332706.552145]  asm_exc_page_fault+0x22/0x30
> [332706.552157] RIP: 0010:rep_stos_alternative+0x40/0x80                                         --- (1)
> [332706.552173] Code: Unable to access opcode bytes at 0x7ffe929c4fd6.                           --- (2)
> [332706.552181] RSP: 3fa6:ffff88ba5e554200 EFLAGS: ffffffff9d50c0e0 ORIG_RAX: ffff88ba5e554200   --- (3)
> [332706.552195] ==================================================================
> [332706.552324] BUG: KASAN: out-of-bounds in __show_regs+0x4b/0x340                              --- (4)
> [332706.552433] Read of size 8 at addr ffff88d24999fb20 by task sysrq_t_test.sh/3977032
> 
> We focus on logs (1) to (4):
>  * Log (1) shows the current regs->ip (a kernel-mode address).
>  * Log (2) displays regs->ip - PROLOGUE_SIZE, which lies far outside the
>    kernel address range, indicating that regs->ip has changed.
>  * Log (3) then reveals anomalous values in regs->{ss,sp,flags}.
>  * Finally, Log (4) reports a KASAN OOB error when accessing regs->bx.
> 
> Stack tracing a running task cannot guarantee the accuracy of the printed
> regs values, but accessing regs addresses does not cause kernel instability.
> Similar cases have been consistently handled this way in the past. [1][2]
> We therefore consider this a KASAN false positive.
> 
> [1] https://lore.kernel.org/all/20220915150417.722975-40-glider@google.com/T/#u
> [2] https://lore.kernel.org/all/5f6e80c4b0c7f7f0b6211900847a247cdaad753c.1479398226.git.jpoimboe@redhat.com/T/#u
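
To make the reasoning about log (2) concrete, here is a small user-space
sketch of the arithmetic. It assumes PROLOGUE_SIZE is 42, as I believe it is
defined in arch/x86/kernel/dumpstack.c, and it uses a simplified "bit 63 set"
test for kernel-half addresses on x86-64. The helper names are hypothetical
and exist only for this illustration; this is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors PROLOGUE_SIZE in arch/x86/kernel/dumpstack.c. */
#define PROLOGUE_SIZE 42

/* Address printed in log (2): "Unable to access opcode bytes at ...". */
#define LOG2_PROLOGUE_ADDR UINT64_C(0x7ffe929c4fd6)

/* Recover the regs->ip value that the opcode dump was based on. */
static uint64_t ip_from_prologue(uint64_t prologue_addr)
{
	return prologue_addr + PROLOGUE_SIZE;
}

/*
 * Simplification: treat any address with bit 63 set as belonging to the
 * kernel half of the x86-64 address space.
 */
static bool in_kernel_half(uint64_t addr)
{
	return (addr >> 63) != 0;
}

/* Hypothetical helper: checks the values taken from the sample log. */
static void check_log_sample(void)
{
	uint64_t ip = ip_from_prologue(LOG2_PROLOGUE_ADDR);

	assert(ip == UINT64_C(0x7ffe929c5000));
	assert(!in_kernel_half(ip));
}
```

Under these assumptions, the ip behind log (2) works out to 0x7ffe929c5000,
which is not a kernel-half address, while log (1) resolved regs->ip to a
kernel symbol. That is consistent with regs->ip having changed between the
two reads.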

Gentle ping. 
Any comment or suggestion is appreciated.

Thanks,
Tengda

