lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20221201085311.249883-1-chenzhongjin@huawei.com>
Date:   Thu, 1 Dec 2022 16:53:11 +0800
From:   Chen Zhongjin <chenzhongjin@...wei.com>
To:     <linux-kernel@...r.kernel.org>
CC:     <chenzhongjin@...wei.com>, <jpoimboe@...nel.org>,
        <peterz@...radead.org>, <tglx@...utronix.de>, <mingo@...hat.com>,
        <bp@...en8.de>, <dave.hansen@...ux.intel.com>, <x86@...nel.org>,
        <hpa@...or.com>
Subject: [PATCH] x86/unwind/orc: Fix unwind ip when kprobes hits push/pop

When unwind stack at asm_exc_int3, the orc type is UNWIND_HINT_TYPE_REGS
and the unwinder will use pt_regs->ip to find next orc, which point to
the probed insn + INT3_INSN_SIZE.

If the probed insn is push/pop, it will point to the next insn which has
different orc state. Before the probed insn has been really excuted in
single step, using next insn ip to find orc will get a wrong unwinding
result.

So, when there is kprobe running and the previous op code is int3,
state->signal should be false so that ip - 1 will be used to find next
orc, until the probed push/pop has been single steped and kprobe set
kprobe_status as KPROBE_HIT_SSDONE.

Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Signed-off-by: Chen Zhongjin <chenzhongjin@...wei.com>
---
dump_stack() in pre_handler probing push:

 Call Trace:
  <TASK>
  dump_stack_lvl+0x79/0x9b
  push_handler_pre+0x1b/0x2e [kp_unwind]
  aggr_pre_handler+0xd8/0x180
  ? opt_pre_handler+0x160/0x160
  ? run_push+0xd/0x61 [kp_unwind]
  kprobe_int3_handler+0x3f3/0x530
  do_int3+0x3b/0x80
  exc_int3+0x2b/0x80
  asm_exc_int3+0x35/0x40
 RIP: 0010:run_push+0xd/0x61 [kp_unwind]
 RSP: 0018:ffff88800650f998 EFLAGS: 00000293
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffc0708060
 RDX: ffff8880124b0000 RSI: 0000000000000000 RDI: ffffed1000ca1f28
 RBP: 0000000000000000 R08: 0000000000000020 R09: ffffed1000ca1ef7
 R10: ffff88800650f7b7 R11: ffffed1000ca1ef6 R12: ffffffffc0668000
 R13: 0000000000000000 R14: fffffbfff80e14f8 R15: 1ffffffff80e14f9
  ? 0xffffffffc0668000
  ? run_push+0xb/0x61 [kp_unwind]
  ? run_push+0xd/0x61 [kp_unwind]
  ? kprobe_init+0x8a/0x1000 [kp_unwind]
  ? do_one_initcall+0xd0/0x4e0
  ? trace_event_raw_event_initcall_level+0x1c0/0x1c0
  ? rcu_read_lock_sched_held+0xa5/0xd0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? __kmem_cache_alloc_node+0x1da/0x780
  ? kasan_unpoison+0x23/0x50
  ? 0xffffffffc0668000
  ? do_init_module+0x1cc/0x6a0
  ? load_module+0x5eee/0x7210
  ? ext4_file_read_iter+0x161/0x3a0
  ? module_frob_arch_sections+0x40/0x40
  ? security_file_permission+0x408/0x600
  ? security_kernel_post_read_file+0x93/0xc0
  ? __do_sys_finit_module+0x13c/0x200
  ? __do_sys_finit_module+0x13c/0x200
  ? __ia32_sys_init_module+0xb0/0xb0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? syscall_enter_from_user_mode+0x1d/0x50
  ? syscall_enter_from_user_mode+0x1d/0x50
  ? do_syscall_64+0x38/0x90
  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
  </TASK>

After apply this patch, stack trace is correct: 

 Call Trace:
  <TASK>
  dump_stack_lvl+0x79/0x9b
  push_handler_post+0x1b/0x27 [kp_unwind]
  aggr_post_handler+0xdc/0x160
  ? aggr_pre_handler+0x180/0x180
  ? run_push+0xc/0x61 [kp_unwind]
  kprobe_post_process+0x7b/0x210
  kprobe_int3_handler+0x307/0x530
  ? 0xffffffffc0770000
  ? 0xffffffffc0770002
  do_int3+0x3b/0x80
  exc_int3+0x2b/0x80
  asm_exc_int3+0x35/0x40
 RIP: 0010:run_push+0xd/0x61 [kp_unwind]
 RSP: 0018:ffff88801014f9a0 EFLAGS: 00000293
 RAX: 0000000000000293 RBX: 0000000000000000 RCX: ffffffffc0760060
 RDX: ffff88800d1f8000 RSI: 0000000000000000 RDI: ffffed1002029f28
 RBP: 0000000000000000 R08: 0000000000000020 R09: ffffed1002029ef7
 R10: ffff88801014f7b7 R11: ffffed1002029ef6 R12: ffffffffc0768000
 R13: 0000000000000003 R14: fffffbfff80ec4f8 R15: 1ffffffff80ec4f9
  ? 0xffffffffc0768000
  ? run_push+0xb/0x61 [kp_unwind]
  ? 0xffffffffc0770002
  kprobe_init+0x8a/0x1000 [kp_unwind]
  do_one_initcall+0xd0/0x4e0
  ? trace_event_raw_event_initcall_level+0x1c0/0x1c0
  ? rcu_read_lock_sched_held+0xa5/0xd0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? __kmem_cache_alloc_node+0x1da/0x780
  ? kasan_unpoison+0x23/0x50
  ? 0xffffffffc0768000
  do_init_module+0x1cc/0x6a0
  load_module+0x5eee/0x7210
  ? ext4_file_read_iter+0x161/0x3a0
  ? module_frob_arch_sections+0x40/0x40
  ? security_file_permission+0x408/0x600
  ? security_kernel_post_read_file+0x93/0xc0
  ? __do_sys_finit_module+0x13c/0x200
  __do_sys_finit_module+0x13c/0x200
  ? __ia32_sys_init_module+0xb0/0xb0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? rcu_read_lock_bh_held+0xc0/0xc0
  ? syscall_enter_from_user_mode+0x1d/0x50
  ? syscall_enter_from_user_mode+0x1d/0x50
  do_syscall_64+0x38/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 RIP: 0033:0x7fe46ad1b839
---
 arch/x86/kernel/unwind_orc.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index c059820dfaea..81e74b7e7fda 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -2,6 +2,7 @@
 #include <linux/objtool.h>
 #include <linux/module.h>
 #include <linux/sort.h>
+#include <linux/kprobes.h>
 #include <asm/ptrace.h>
 #include <asm/stacktrace.h>
 #include <asm/unwind.h>
@@ -425,6 +426,7 @@ bool unwind_next_frame(struct unwind_state *state)
 	enum stack_type prev_type = state->stack_info.type;
 	struct orc_entry *orc;
 	bool indirect = false;
+	u8 op;
 
 	if (unwind_done(state))
 		return false;
@@ -568,7 +570,21 @@ bool unwind_next_frame(struct unwind_state *state)
 		state->regs = (struct pt_regs *)sp;
 		state->prev_regs = NULL;
 		state->full_regs = true;
-		state->signal = true;
+#ifdef CONFIG_KPROBES
+		/*
+		 * When kprobe replaces push/pop to int3, pt_regs->ip points to
+		 * the next insn which has different orc state.
+		 * Before push/pop is really excuted in single step, signal should
+		 * be set to false so we will use ip - 1 to find correct next orc.
+		 */
+		if (kprobe_running() &&
+			get_kprobe_ctlblk()->kprobe_status != KPROBE_HIT_SSDONE &&
+			!get_kernel_nofault(op, (u8 *)state->ip - 1) &&
+			op == INT3_INSN_OPCODE)
+			state->signal = false;
+		else
+#endif
+			state->signal = true;
 		break;
 
 	case UNWIND_HINT_TYPE_REGS_PARTIAL:
-- 
2.17.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ