Message-ID: <20201110031955.flxf7iq5yoxjzmsg@treble>
Date:   Mon, 9 Nov 2020 21:19:55 -0600
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Shinichiro Kawasaki <shinichiro.kawasaki@....com>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Nicholas Piggin <npiggin@...il.com>,
        Damien Le Moal <Damien.LeMoal@....com>
Subject: Re: WARNING: can't access registers at asm_common_interrupt

On Mon, Nov 09, 2020 at 09:10:38AM +0000, Shinichiro Kawasaki wrote:
> On Nov 06, 2020 / 12:06, Josh Poimboeuf wrote:
> > On Fri, Nov 06, 2020 at 06:04:15AM +0000, Shinichiro Kawasaki wrote:
> > > Greetings,
> > > 
> > > I repeatedly observe "WARNING: can't access registers at
> > > asm_common_interrupt+0x1e/0x40" on my kernel test system. It is printed by
> > > unwind_next_frame() in "arch/x86/kernel/unwind_orc.c". Syzbot has already
> > > reported it [1]. A similar warning was reported and discussed [2], but I
> > > believe the cause has not yet been identified.
> > > 
> > > The warning was observed with v5.10-rc2 and older tags. I bisected and found
> > > that commit 044d0d6de9f5 ("lockdep: Only trace IRQ edges") in v5.9-rc3
> > > triggered the warning. With that commit reverted from 5.10-rc2, the warning
> > > disappeared. Could the experts on CC comment on how this commit might relate
> > > to the warning?
> > > 
> > > The test conditions needed to reproduce the warning are rather unusual
> > > (blktests, dm-linear and ZNS device emulation by QEMU). If you can suggest
> > > any steps for further analysis, I'm willing to run them on my test system.
> > 
> > Hi,
> > 
> > Thanks for reporting this issue.  This might be a different issue from
> > [2].
> > 
> > Can you send me the arch/x86/entry/entry_64.o file from your build?
> 
> Hi Josh, thank you for your response. I have sent entry_64.o in a separate
> e-mail to your address only, since I hesitate to send the 76kb attachment
> to LKML. If it does not reach you, please let me know.

Got it, thanks.  Unfortunately I'm still confused.

Can you test with the following patch, and boot with 'unwind_debug'?
That should hopefully dump a lot of useful data along with the warning.

From: Josh Poimboeuf <jpoimboe@...hat.com>
Subject: [PATCH] x86/unwind/orc: Add 'unwind_debug' cmdline option

Sometimes the one-line ORC unwinder warnings aren't very helpful.  Add a
new 'unwind_debug' cmdline option which will dump the full stack
contents of the current task when an error condition is encountered.

Signed-off-by: Josh Poimboeuf <jpoimboe@...hat.com>
Reviewed-by: Miroslav Benes <mbenes@...e.cz>
---
 .../admin-guide/kernel-parameters.txt         |  6 +++
 arch/x86/kernel/unwind_orc.c                  | 48 ++++++++++++++++++-
 2 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 526d65d8573a..4bed92c51723 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5512,6 +5512,12 @@
 	unknown_nmi_panic
 			[X86] Cause panic on unknown NMI.
 
+	unwind_debug	[X86-64]
+			Enable unwinder debug output.  This can be
+			useful for debugging certain unwinder error
+			conditions, including corrupt stacks and
+			bad/missing unwinder metadata.
+
 	usbcore.authorized_default=
 			[USB] Default USB device authorization:
 			(default -1 = authorized except for wireless USB,
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 73f800100066..44bae03f9bfc 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -13,8 +13,13 @@
 
 #define orc_warn_current(args...)					\
 ({									\
-	if (state->task == current)					\
+	static bool dumped_before;					\
+	if (state->task == current) {					\
 		orc_warn(args);						\
+		if (unwind_debug && !dumped_before)			\
+			unwind_dump(state);				\
+		dumped_before = true;					\
+	}								\
 })
 
 extern int __start_orc_unwind_ip[];
@@ -23,8 +28,49 @@ extern struct orc_entry __start_orc_unwind[];
 extern struct orc_entry __stop_orc_unwind[];
 
 static bool orc_init __ro_after_init;
+static bool unwind_debug __ro_after_init;
 static unsigned int lookup_num_blocks __ro_after_init;
 
+static int __init unwind_debug_cmdline(char *str)
+{
+	unwind_debug = true;
+
+	return 0;
+}
+early_param("unwind_debug", unwind_debug_cmdline);
+
+static void unwind_dump(struct unwind_state *state)
+{
+	static bool dumped_before;
+	unsigned long word, *sp;
+	struct stack_info stack_info = {0};
+	unsigned long visit_mask = 0;
+
+	if (dumped_before)
+		return;
+
+	dumped_before = true;
+
+	printk_deferred("unwind stack type:%d next_sp:%p mask:0x%lx graph_idx:%d\n",
+			state->stack_info.type, state->stack_info.next_sp,
+			state->stack_mask, state->graph_idx);
+
+	for (sp = __builtin_frame_address(0); sp;
+	     sp = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
+		if (get_stack_info(sp, state->task, &stack_info, &visit_mask))
+			break;
+
+		for (; sp < stack_info.end; sp++) {
+
+			word = READ_ONCE_NOCHECK(*sp);
+
+			printk_deferred("%0*lx: %0*lx (%pB)\n", BITS_PER_LONG/4,
+					(unsigned long)sp, BITS_PER_LONG/4,
+					word, (void *)word);
+		}
+	}
+}
+
 static inline unsigned long orc_ip(const int *ip)
 {
 	return (unsigned long)ip + *ip;
-- 
2.25.4
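
For readers who want to see the shape of the output without building a kernel, here is a rough user-space sketch of the two ideas in unwind_dump() above: the once-only guard (dumped_before) and the zero-padded "address: value" line format. The names HEX_DIGITS, format_word() and dump_words_once() are made up for this illustration and are not part of the patch. The sketch also leaves out the kernel-only pieces (get_stack_info(), READ_ONCE_NOCHECK(), printk_deferred() and the %pB symbol lookup), so treat it as an approximation only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hex digits in one unsigned long; plays the role of BITS_PER_LONG/4. */
#define HEX_DIGITS ((int)(sizeof(unsigned long) * CHAR_BIT / 4))

/*
 * Hypothetical helper: format one "address: value" line into buf.
 * Returns the number of characters snprintf() would have written.
 */
static int format_word(char *buf, size_t len,
		       unsigned long addr, unsigned long word)
{
	assert(buf && len > 0);

	return snprintf(buf, len, "%0*lx: %0*lx",
			HEX_DIGITS, addr, HEX_DIGITS, word);
}

/*
 * Hypothetical helper: dump the words in [start, end) to 'out', but only
 * on the first call.  Later calls return 0 without printing, similar to
 * the dumped_before guard in the patch.
 */
static size_t dump_words_once(const unsigned long *start,
			      const unsigned long *end, FILE *out)
{
	static bool dumped_before;
	char line[64];
	size_t n = 0;

	if (dumped_before)
		return 0;

	dumped_before = true;

	for (const unsigned long *p = start; p < end; p++, n++) {
		format_word(line, sizeof(line), (unsigned long)p, *p);
		fprintf(out, "%s\n", line);
	}

	return n;
}
```

Calling dump_words_once() on a small local array prints one line per word the first time and nothing on any later call, which is the behavior 'unwind_debug' aims for so that a burst of warnings does not flood the log.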
