linux-kernel - Re: [PATCH 6/6] LoongArch: Add generic ex-handler unwind in prologue unwinder

Message-ID: <2bba3021-b4d8-f23e-c924-3e951ce3f768@loongson.cn>
Date:   Fri, 16 Dec 2022 09:44:09 +0800
From:   Jinyang He <hejinyang@...ngson.cn>
To:     Qing Zhang <zhangqing@...ngson.cn>,
        Huacai Chen <chenhuacai@...nel.org>,
        WANG Xuerui <kernel@...0n.name>
Cc:     loongarch@...ts.linux.dev, linux-kernel@...r.kernel.org,
        Steven Rostedt <rostedt@...dmis.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Mark Rutland <mark.rutland@....com>
Subject: Re: [PATCH 6/6] LoongArch: Add generic ex-handler unwind in prologue
 unwinder

On 2022-12-15 20:04, Qing Zhang wrote:

> Hi, Jinyang
>
> On 2022/12/15 12:01 PM, Jinyang He wrote:
>> When an exception is triggered, the code flow goes to handle_\exception
>> in some cases. One of the stack frames in this case is as follows,
>>
>> high -> +-------+
>>          | REGS  |  <- a pt_regs
>>          |       |
>>          |       |  <- ex trigger
>>          | REGS  |  <- ex pt_regs   <-+
>>          |       |                    |
>>          |       |                    |
>> low  -> +-------+           ->unwind-+
>>
>> When the unwinder unwinds to handle_\exception, it cannot go on with
>> prologue analysis. This is an asynchronous code flow, so we should get
>> the next frame's PC from regs->csr_era rather than from regs->regs[1].
>> Also, we copy the handler code to eentry early in boot, and copy it to
>> NUMA-relative memory named pcpu_handlers if NUMA is enabled. Thus, the
>> unwinder cannot unwind normally. Therefore, try to give some hints in
>> handle_\exception and fix them up in unwind_next_frame.
>>
>> Reported-by: Qing Zhang <zhangqing@...ngson.cn>
>> Signed-off-by: Jinyang He <hejinyang@...ngson.cn>
>> ---
>>   arch/loongarch/include/asm/unwind.h     |   2 +-
>>   arch/loongarch/kernel/genex.S           |   3 +
>>   arch/loongarch/kernel/unwind_prologue.c | 100 +++++++++++++++++++++---
>>   arch/loongarch/mm/tlb.c                 |   2 +-
>>   4 files changed, 92 insertions(+), 15 deletions(-)
>>
> The rest looks good to me, but there is still a small problem:
> When I tested hw_breakpoint.ko with the prologue unwinder, the trace
> sometimes shows a raw address, [<9000000100302724>] 0x9000000100302724,
> e.g. for CPU: 3 PID: 0.
> But other traces are normal, e.g. for CPU: 0 PID: 0.
>
> [27.655549] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 6.1.0-rc8 #9
> [27.655552] Hardware name: Loongson Loongson-3A5000-7A1000-1w-A2101/Loongson-LS3A5000-7A1000-1w-A2101, BIOS vUDK2018-LoongArch-V2.0.pre-beta8 06/15/2022
>
> [27.655604]...
> [27.655606] Call Trace:
> [27.655607] [<9000000000222f88>] show_stack+0x60/0x184
> [27.655613] [<90000000010e9b8c>] dump_stack_lvl+0x60/0x88
> [27.655618] [] sample_hbp_handler+0x30/0x4c [data_breakpoint]
> [27.655626] [<900000000037c8a0>] __perf_event_overflow+0x84/0x26c
> [27.655629] [<900000000038980c>] perf_bp_event+0xc0/0xc8
> [27.655633] [<900000000022e3bc>] watchpoint_handler+0x54/0x88
> [27.655637] [<90000000010ea2f8>] do_watch+0x30/0x48
> [27.655640] [<9000000100302724>] 0x9000000100302724      // Not natural
> [27.655642] [<9000000000ab4cbc>] add_interrupt_randomness+0x60/0xbc
> [27.655646] [<90000000002a0fa0>] handle_irq_event_percpu+0x28/0x70
> [27.655650] [<90000000002a6f9c>] handle_percpu_irq+0x54/0x88
> [27.655652] [<90000000002a025c>] generic_handle_domain_irq+0x28/0x40
> [27.655655] [<9000000000995458>] handle_cpu_irq+0x68/0xa4
> [27.655658] [<90000000010ea8dc>] handle_loongarch_irq+0x34/0x4c
> [27.655661] [<90000000010ea974>] do_vint+0x80/0xb4
> [27.655664] [<90000000002216a0>] __arch_cpu_idle+0x20/0x24
> [27.655667] [<90000000010f8178>] default_idle_call+0x30/0x58
> [27.655670] [<90000000002825cc>] do_idle+0xb4/0x118
> [27.655674] [<900000000028281c>] cpu_startup_entry+0x20/0x24
> [27.655677] [<900000000022b198>] start_secondary+0x9c/0xa4
> [27.655679] [<90000000010eb124>] smpboot_entry+0x60/0x64
>
>
> [27.658940] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rc8 #9
> ...
> [28.229978] Call Trace:
> [28.229979] [<9000000000222f88>] show_stack+0x60/0x184
> [28.237503] [<90000000010e9b8c>] dump_stack_lvl+0x60/0x88
> [28.242866] [] sample_hbp_handler+0x30/0x4c [data_breakpoint]
> [28.250132] [<900000000037c8a0>] __perf_event_overflow+0x84/0x26c
> [28.256186] [<900000000038980c>] perf_bp_event+0xc0/0xc8
> [28.261462] [<900000000022e3bc>] watchpoint_handler+0x54/0x88
> [28.267170] [<90000000010ea2f8>] do_watch+0x30/0x48
> [28.272013] [<90000000017d2724>] exception_handlers+0x2724/0x1000  //...

This address is not in the kernel text section but in the kernel bss section.
The boot CPU sets csr.eentry to eentry, and the other CPUs set csr.eentry to
pcpu_handlers[cpu]. None of these eentry copies is at the original position
of the handler code, so we cannot find the real symbol. But I still give the
unwinder a chance to go on and record the PC value when unwind_state_fixup
returns true in unwind_by_prologue().
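
As a rough illustration only (this is not the code in the patch), a PC that
falls inside a copied handler region could be translated back to the same
offset in a region that kallsyms knows about. The struct handler_copy type
and the fixup_copied_pc() helper are hypothetical names, and the region size
used in the example further down is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one relocated copy of the handler code. */
struct handler_copy {
	uint64_t base;	/* start of the copy, e.g. pcpu_handlers[cpu] */
	uint64_t size;	/* number of bytes that were copied (assumed known) */
};

/*
 * If *pc lies inside one of the copies, rewrite it to the same offset
 * from ref_base, a region whose symbols can be resolved.
 * Returns true when *pc was translated, false when it was left alone.
 */
bool fixup_copied_pc(uint64_t *pc, uint64_t ref_base,
		     const struct handler_copy *copies, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t start = copies[i].base;

		/* Unsigned compare first, so the subtraction cannot wrap. */
		if (*pc >= start && *pc - start < copies[i].size) {
			*pc = ref_base + (*pc - start);
			return true;
		}
	}
	return false;
}
```

With the addresses from the traces above, and assuming the per-CPU copy
starts at 0x9000000100300000 (0x9000000100302724 minus the 0x2724 offset)
and is 0x10000 bytes long, translating 0x9000000100302724 against a
reference base of 0x90000000017d0000 gives 0x90000000017d2724, which is the
address that CPU 0 printed as exception_handlers+0x2724.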


Thanks,

Jinyang


> [28.278155] [<9000000000ab4cbc>] add_interrupt_randomness+0x60/0xbc
> [28.284381] [<90000000002a0fa0>] handle_irq_event_percpu+0x28/0x70
> [28.290520] [<90000000002a6f9c>] handle_percpu_irq+0x54/0x88
> [28.296140] [<90000000002a025c>] generic_handle_domain_irq+0x28/0x40
> [28.302452] [<9000000000995458>] handle_cpu_irq+0x68/0xa4
> [28.307813] [<90000000010ea8dc>] handle_loongarch_irq+0x34/0x4c
> [28.313693] [<90000000010ea974>] do_vint+0x80/0xb4
> [28.318450] [<90000000002216a0>] __arch_cpu_idle+0x20/0x24
> [28.323897] [<90000000010f8178>] default_idle_call+0x30/0x58
> [28.329518] [<90000000002825cc>] do_idle+0xb4/0x118
> [28.334361] [<900000000028281c>] cpu_startup_entry+0x20/0x24
> [28.339982] [<90000000010ec0dc>] kernel_init+0x0/0x110
> [28.345085] [<90000000011106f8>] arch_post_acpi_subsys_init+0x0/0x4
>
> Maybe kallsyms sometimes does not recognize assembly symbols, let me think...
>
> Thanks,
> -Qing
>