Message-Id: <baafcd3cc1abb14cb757fe081fa696012a5265ee.1676068346.git.jpoimboe@kernel.org>
Date:   Fri, 10 Feb 2023 14:42:02 -0800
From:   Josh Poimboeuf <jpoimboe@...nel.org>
To:     x86@...nel.org
Cc:     linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Chen Zhongjin <chenzhongjin@...wei.com>,
        "Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
        Anil S Keshavamurthy <anil.s.keshavamurthy@...el.com>,
        "David S. Miller" <davem@...emloft.net>,
        Masami Hiramatsu <mhiramat@...nel.org>
Subject: [PATCH 2/2] x86/entry: Fix unwinding from kprobe on PUSH/POP instruction

If a kprobe (INT3) is set on a stack-modifying single-byte instruction,
like a single-byte PUSH/POP or a LEAVE, ORC fails to unwind past it:

  Call Trace:
   <TASK>
   dump_stack_lvl+0x57/0x90
   handler_pre+0x33/0x40 [kprobe_example]
   aggr_pre_handler+0x49/0x90
   kprobe_int3_handler+0xe3/0x180
   do_int3+0x3a/0x80
   exc_int3+0x7d/0xc0
   asm_exc_int3+0x35/0x40
  RIP: 0010:kernel_clone+0xe/0x3a0
  Code: cc e8 16 b2 bf 00 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 57 41 56 41 55 41 54 cc <53> 48 89 fb 48 83 ec 68 4c 8b 27 65 48 8b 04 25 28 00 00 00 48 89
  RSP: 0018:ffffc9000074fda0 EFLAGS: 00000206
  RAX: 0000000000808100 RBX: ffff888109de9d80 RCX: 0000000000000000
  RDX: 0000000000000011 RSI: ffff888109de9d80 RDI: ffffc9000074fdc8
  RBP: ffff8881019543c0 R08: ffffffff81127e30 R09: 00000000e71742a5
  R10: ffff888104764a18 R11: 0000000071742a5e R12: ffff888100078800
  R13: ffff888100126000 R14: 0000000000000000 R15: ffff888100126005
   ? __pfx_call_usermodehelper_exec_async+0x10/0x10
   ? kernel_clone+0xe/0x3a0
   ? user_mode_thread+0x5b/0x80
   ? __pfx_call_usermodehelper_exec_async+0x10/0x10
   ? call_usermodehelper_exec_work+0x77/0xb0
   ? process_one_work+0x299/0x5f0
   ? worker_thread+0x4f/0x3a0
   ? __pfx_worker_thread+0x10/0x10
   ? kthread+0xf2/0x120
   ? __pfx_kthread+0x10/0x10
   ? ret_from_fork+0x29/0x50
   </TASK>

The problem is that #BP saves the pointer to the instruction immediately
*after* the INT3, rather than to the INT3 itself.  The instruction
replaced by the INT3 hasn't actually run, but ORC assumes otherwise and
expects the wrong stack layout.

Fix it by annotating the #BP exception as a non-signal stack frame,
which tells the ORC unwinder to decrement the instruction pointer before
looking up the corresponding ORC entry.
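
For illustration only, the lookup adjustment can be modeled in a few
lines of standalone C.  This is a toy sketch, not the kernel's ORC code:
the toy_* names and the table are hypothetical, and the offsets are
inferred from the Code: bytes in the trace above (a 5-byte NOP, four
2-byte PUSHes, then the probed instruction at +0xd, assumed here to be a
single-byte PUSH, with the saved RIP at +0xe).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy stand-in for an ORC entry (hypothetical, not the kernel struct):
 * from address 'ip' up to the next entry, the caller's frame (CFA) is
 * found at %rsp + sp_offset.
 */
struct toy_orc_entry {
	uint64_t ip;        /* function-relative start address */
	int      sp_offset; /* CFA = %rsp + sp_offset          */
};

/* Assumed prologue layout, inferred from the Code: bytes in the trace. */
static const struct toy_orc_entry toy_table[] = {
	{ 0x00,  8 },	/* entry: only the return address is on the stack */
	{ 0x07, 16 },	/* after the 2-byte PUSH at +0x5                  */
	{ 0x09, 24 },	/* after the 2-byte PUSH at +0x7                  */
	{ 0x0b, 32 },	/* after the 2-byte PUSH at +0x9                  */
	{ 0x0d, 40 },	/* after the 2-byte PUSH at +0xb                  */
	{ 0x0e, 48 },	/* after the 1-byte PUSH at +0xd (the probed one) */
};

/* Return the sp_offset of the last entry whose start address is <= ip. */
int toy_orc_find(uint64_t ip)
{
	size_t n = sizeof(toy_table) / sizeof(toy_table[0]);
	int sp_offset = toy_table[0].sp_offset;

	for (size_t i = 0; i < n; i++) {
		if (toy_table[i].ip <= ip)
			sp_offset = toy_table[i].sp_offset;
	}
	return sp_offset;
}

/*
 * Model of the lookup rule described above: a "signal" frame looks up
 * the saved IP as-is, a non-signal frame looks up IP - 1.
 */
int toy_sp_offset(uint64_t saved_ip, int signal)
{
	return toy_orc_find(signal ? saved_ip : saved_ip - 1);
}
```

In this model, toy_sp_offset(0xe, 1) finds the entry at +0xe and returns
48, as if the probed PUSH had already run.  toy_sp_offset(0xe, 0) looks
up +0xd instead and returns 40, which matches the stack as it really is
when the INT3 fires.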

Reported-by: Chen Zhongjin <chenzhongjin@...wei.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@...nel.org>
---
 arch/x86/entry/entry_64.S | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 15739a2c0983..8d21881adf86 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -385,7 +385,14 @@ SYM_CODE_END(xen_error_entry)
  */
 .macro idtentry vector asmsym cfunc has_error_code:req
 SYM_CODE_START(\asmsym)
-	UNWIND_HINT_IRET_REGS offset=\has_error_code*8
+
+	.if \vector == X86_TRAP_BP
+		/* #BP advances %rip to the next instruction */
+		UNWIND_HINT_IRET_REGS offset=\has_error_code*8 signal=0
+	.else
+		UNWIND_HINT_IRET_REGS offset=\has_error_code*8
+	.endif
+
 	ENDBR
 	ASM_CLAC
 	cld
-- 
2.39.1