[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20201102141138.1fa825113742f3bea23bc383@kernel.org>
Date: Mon, 2 Nov 2020 14:11:38 +0900
From: Masami Hiramatsu <mhiramat@...nel.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>, Eddy_Wu@...ndmicro.com,
x86@...nel.org, davem@...emloft.net, naveen.n.rao@...ux.ibm.com,
anil.s.keshavamurthy@...el.com, linux-arch@...r.kernel.org,
cameron@...dycamel.com, oleg@...hat.com, will@...nel.org,
paulmck@...nel.org
Subject: Re: [PATCH v5 14/21] kprobes: Remove NMI context check
On Fri, 30 Oct 2020 21:38:31 -0400
Steven Rostedt <rostedt@...dmis.org> wrote:
> On Sat, 29 Aug 2020 22:02:36 +0900
> Masami Hiramatsu <mhiramat@...nel.org> wrote:
>
> > Since the commit 9b38cc704e84 ("kretprobe: Prevent triggering
> > kretprobe from within kprobe_flush_task") sets a dummy current
> > kprobe in the trampoline handler by kprobe_busy_begin/end(),
> > it is not possible to run a kretprobe pre handler in kretprobe
> > trampoline handler context even with the NMI. If the NMI interrupts
> > a kretprobe_trampoline_handler() and it hits a kretprobe, the
> > 2nd kretprobe will detect recursion correctly and it will be
> > skipped.
> > This means we have almost no double-lock issue on kretprobes by NMI.
> >
> > The last one point is in cleanup_rp_inst() which also takes
> > kretprobe_table_lock without setting up current kprobes.
> > So adding kprobe_busy_begin/end() there allows us to remove
> > in_nmi() check.
> >
> > The above commit applies kprobe_busy_begin/end() on x86, but
> > now all arch implementation are unified to generic one, we can
> > safely remove the in_nmi() check from arch independent code.
> >
>
> So are you saying that lockdep is lying?
>
> Kprobe smoke test: started
>
> ================================
> WARNING: inconsistent lock state
> 5.10.0-rc1-test+ #29 Not tainted
> --------------------------------
> inconsistent {INITIAL USE} -> {IN-NMI} usage.
> swapper/0/1 [HC1[1]:SC0[0]:HE0:SE1] takes:
> ffffffff82b07118 (&rp->lock){....}-{2:2}, at: pre_handler_kretprobe+0x4b/0x193
> {INITIAL USE} state was registered at:
> lock_acquire+0x280/0x325
> _raw_spin_lock+0x30/0x3f
> recycle_rp_inst+0x3f/0x86
> __kretprobe_trampoline_handler+0x13a/0x177
> trampoline_handler+0x48/0x57
> kretprobe_trampoline+0x2a/0x4f
> kretprobe_trampoline+0x0/0x4f
> init_kprobes+0x193/0x19d
> do_one_initcall+0xf9/0x27e
> kernel_init_freeable+0x16e/0x2b6
> kernel_init+0xe/0x109
> ret_from_fork+0x22/0x30
> irq event stamp: 1670
> hardirqs last enabled at (1669): [<ffffffff811cc344>] slab_free_freelist_hook+0xb4/0xfd
> hardirqs last disabled at (1670): [<ffffffff81da0887>] exc_int3+0xae/0x10a
> softirqs last enabled at (1484): [<ffffffff82000352>] __do_softirq+0x352/0x38d
> softirqs last disabled at (1471): [<ffffffff81e00f82>] asm_call_irq_on_stack+0x12/0x20
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&rp->lock);
> <Interrupt>
> lock(&rp->lock);
>
> *** DEADLOCK ***
>
> no locks held by swapper/0/1.
>
> stack backtrace:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc1-test+ #29
> Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
> Call Trace:
> dump_stack+0x7d/0x9f
> print_usage_bug+0x1c0/0x1d3
> lock_acquire+0x302/0x325
> ? pre_handler_kretprobe+0x4b/0x193
> ? stop_machine_from_inactive_cpu+0x120/0x120
> _raw_spin_lock_irqsave+0x43/0x58
> ? pre_handler_kretprobe+0x4b/0x193
> pre_handler_kretprobe+0x4b/0x193
> ? stop_machine_from_inactive_cpu+0x120/0x120
> ? kprobe_target+0x1/0x16
> kprobe_int3_handler+0xd0/0x109
> exc_int3+0xb8/0x10a
> asm_exc_int3+0x31/0x40
> RIP: 0010:kprobe_target+0x1/0x16
> 5d c3 cc
> RSP: 0000:ffffc90000033e00 EFLAGS: 00000246
> RAX: ffffffff8110ea77 RBX: 0000000000000001 RCX: ffffc90000033cb4
> RDX: 0000000000000231 RSI: 0000000000000000 RDI: 000000003ca57c35
> RBP: ffffc90000033e20 R08: 0000000000000000 R09: ffffffff8111d207
> R10: ffff8881002ab480 R11: ffff8881002ab480 R12: 0000000000000000
> R13: ffffffff82a52af0 R14: 0000000000000200 R15: ffff888100331130
> ? register_kprobe+0x43c/0x492
> ? stop_machine_from_inactive_cpu+0x120/0x120
> ? kprobe_target+0x1/0x16
> ? init_test_probes+0x2c6/0x38a
> init_kprobes+0x193/0x19d
> ? debugfs_kprobe_init+0xb8/0xb8
> do_one_initcall+0xf9/0x27e
> ? rcu_read_lock_sched_held+0x3e/0x75
> ? init_mm_internals+0x27b/0x284
> kernel_init_freeable+0x16e/0x2b6
> ? rest_init+0x152/0x152
> kernel_init+0xe/0x109
> ret_from_fork+0x22/0x30
> Kprobe smoke test: passed successfully
>
> Config attached.
Thanks for the report! Let me check what happen.
Thank you,
>
> -- Steve
--
Masami Hiramatsu <mhiramat@...nel.org>
Powered by blists - more mailing lists