lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 2 Nov 2020 14:11:38 +0900
From:   Masami Hiramatsu <mhiramat@...nel.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>, Eddy_Wu@...ndmicro.com,
        x86@...nel.org, davem@...emloft.net, naveen.n.rao@...ux.ibm.com,
        anil.s.keshavamurthy@...el.com, linux-arch@...r.kernel.org,
        cameron@...dycamel.com, oleg@...hat.com, will@...nel.org,
        paulmck@...nel.org
Subject: Re: [PATCH v5 14/21] kprobes: Remove NMI context check

On Fri, 30 Oct 2020 21:38:31 -0400
Steven Rostedt <rostedt@...dmis.org> wrote:

> On Sat, 29 Aug 2020 22:02:36 +0900
> Masami Hiramatsu <mhiramat@...nel.org> wrote:
> 
> > Since the commit 9b38cc704e84 ("kretprobe: Prevent triggering
> > kretprobe from within kprobe_flush_task") sets a dummy current
> > kprobe in the trampoline handler by kprobe_busy_begin/end(),
> > it is not possible to run a kretprobe pre handler in kretprobe
> > trampoline handler context even with the NMI. If the NMI interrupts
> > a kretprobe_trampoline_handler() and it hits a kretprobe, the
> > 2nd kretprobe will detect recursion correctly and it will be
> > skipped.
> > This means we have almost no double-lock issue on kretprobes by NMI.
> > 
> > The last one point is in cleanup_rp_inst() which also takes
> > kretprobe_table_lock without setting up current kprobes.
> > So adding kprobe_busy_begin/end() there allows us to remove
> > in_nmi() check.
> > 
> > The above commit applies kprobe_busy_begin/end() on x86, but
> > now all arch implementation are unified to generic one, we can
> > safely remove the in_nmi() check from arch independent code.
> >
> 
> So are you saying that lockdep is lying?
> 
> Kprobe smoke test: started
> 
> ================================
> WARNING: inconsistent lock state
> 5.10.0-rc1-test+ #29 Not tainted
> --------------------------------
> inconsistent {INITIAL USE} -> {IN-NMI} usage.
> swapper/0/1 [HC1[1]:SC0[0]:HE0:SE1] takes:
> ffffffff82b07118 (&rp->lock){....}-{2:2}, at: pre_handler_kretprobe+0x4b/0x193
> {INITIAL USE} state was registered at:
>   lock_acquire+0x280/0x325
>   _raw_spin_lock+0x30/0x3f
>   recycle_rp_inst+0x3f/0x86
>   __kretprobe_trampoline_handler+0x13a/0x177
>   trampoline_handler+0x48/0x57
>   kretprobe_trampoline+0x2a/0x4f
>   kretprobe_trampoline+0x0/0x4f
>   init_kprobes+0x193/0x19d
>   do_one_initcall+0xf9/0x27e
>   kernel_init_freeable+0x16e/0x2b6
>   kernel_init+0xe/0x109
>   ret_from_fork+0x22/0x30
> irq event stamp: 1670
> hardirqs last  enabled at (1669): [<ffffffff811cc344>] slab_free_freelist_hook+0xb4/0xfd
> hardirqs last disabled at (1670): [<ffffffff81da0887>] exc_int3+0xae/0x10a
> softirqs last  enabled at (1484): [<ffffffff82000352>] __do_softirq+0x352/0x38d
> softirqs last disabled at (1471): [<ffffffff81e00f82>] asm_call_irq_on_stack+0x12/0x20
> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>        CPU0
>        ----
>   lock(&rp->lock);
>   <Interrupt>
>     lock(&rp->lock);
> 
>  *** DEADLOCK ***
> 
> no locks held by swapper/0/1.
> 
> stack backtrace:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc1-test+ #29
> Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
> Call Trace:
>  dump_stack+0x7d/0x9f
>  print_usage_bug+0x1c0/0x1d3
>  lock_acquire+0x302/0x325
>  ? pre_handler_kretprobe+0x4b/0x193
>  ? stop_machine_from_inactive_cpu+0x120/0x120
>  _raw_spin_lock_irqsave+0x43/0x58
>  ? pre_handler_kretprobe+0x4b/0x193
>  pre_handler_kretprobe+0x4b/0x193
>  ? stop_machine_from_inactive_cpu+0x120/0x120
>  ? kprobe_target+0x1/0x16
>  kprobe_int3_handler+0xd0/0x109
>  exc_int3+0xb8/0x10a
>  asm_exc_int3+0x31/0x40
> RIP: 0010:kprobe_target+0x1/0x16
>  5d c3 cc
> RSP: 0000:ffffc90000033e00 EFLAGS: 00000246
> RAX: ffffffff8110ea77 RBX: 0000000000000001 RCX: ffffc90000033cb4
> RDX: 0000000000000231 RSI: 0000000000000000 RDI: 000000003ca57c35
> RBP: ffffc90000033e20 R08: 0000000000000000 R09: ffffffff8111d207
> R10: ffff8881002ab480 R11: ffff8881002ab480 R12: 0000000000000000
> R13: ffffffff82a52af0 R14: 0000000000000200 R15: ffff888100331130
>  ? register_kprobe+0x43c/0x492
>  ? stop_machine_from_inactive_cpu+0x120/0x120
>  ? kprobe_target+0x1/0x16
>  ? init_test_probes+0x2c6/0x38a
>  init_kprobes+0x193/0x19d
>  ? debugfs_kprobe_init+0xb8/0xb8
>  do_one_initcall+0xf9/0x27e
>  ? rcu_read_lock_sched_held+0x3e/0x75
>  ? init_mm_internals+0x27b/0x284
>  kernel_init_freeable+0x16e/0x2b6
>  ? rest_init+0x152/0x152
>  kernel_init+0xe/0x109
>  ret_from_fork+0x22/0x30
> Kprobe smoke test: passed successfully
> 
> Config attached.

Thanks for the report! Let me check what happen.

Thank you,

> 
> -- Steve


-- 
Masami Hiramatsu <mhiramat@...nel.org>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ