linux-kernel - Re: BUG - function tracing with breakpoints

Date:	Tue, 29 May 2012 07:37:21 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Dave Jones <davej@...hat.com>
Cc:	"H. Peter Anvin" <hpa@...ux.intel.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Ingo Molnar <mingo@...hat.com>, Andi Kleen <ak@...ux.intel.com>
Subject: Re: BUG - function tracing with breakpoints

On Fri, 2012-05-25 at 16:51 -0400, Steven Rostedt wrote:
> 
> /me continues his search.

Found it!

The int3 handler uses paranoidzeroentry_ist:

.macro paranoidzeroentry_ist sym do_sym ist
ENTRY(\sym)
	INTR_FRAME
	PARAVIRT_ADJUST_EXCEPTION_FRAME
	pushq_cfi $-1		/* ORIG_RAX: no syscall to restart */
	subq $ORIG_RAX-R15, %rsp
	CFI_ADJUST_CFA_OFFSET ORIG_RAX-R15
	call save_paranoid
	TRACE_IRQS_OFF
	movq %rsp,%rdi		/* pt_regs pointer */
	xorl %esi,%esi		/* no error code */
	subq $EXCEPTION_STKSZ, INIT_TSS_IST(\ist)
	call \do_sym
	addq $EXCEPTION_STKSZ, INIT_TSS_IST(\ist)
	jmp paranoid_exit	/* %ebx: no swapgs flag */
	CFI_ENDPROC
END(\sym)

It ends by jumping to paranoid_exit, which finishes with:

paranoid_restore:
	TRACE_IRQS_IRETQ 0
	RESTORE_ALL 8
	jmp irq_return

The problem is TRACE_IRQS_IRETQ, which can call into lockdep. At this
point we are still running on the debug stack, but outside the window
protected by the INIT_TSS_IST subtraction trick shown above.

.macro TRACE_IRQS_IRETQ offset=ARGOFFSET
#ifdef CONFIG_TRACE_IRQFLAGS
	bt   $9,EFLAGS-\offset(%rsp)	/* interrupts off? */
	jnc  1f
	TRACE_IRQS_ON
1:
#endif
.endm


#  define TRACE_IRQS_ON		call trace_hardirqs_on_thunk;

THUNK trace_hardirqs_on_thunk,trace_hardirqs_on_caller,1

which eventually leads to:

trace_hardirqs_on_caller {
	__trace_hardirqs_on_caller(ip) {
		mark_locks_held() {
			mark_lock() {
				save_trace() {
					save_stack_trace()...


Unfortunately, save_stack_trace() is traced by the function tracer,
which means it can hit a breakpoint and jump into the breakpoint code.
Since INIT_TSS_IST has already been restored by then, that int3 resets
the stack pointer to the top of the debug stack and corrupts the stack
we are still running on, causing strange hard-to-debug bugs.
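
The corruption described above can be modeled in user space. The sketch
below is only an illustration, not kernel code: struct cpu_model,
int3_enter() and outer_frame_survives() are made-up names, and the
sizes are tiny placeholders rather than the real EXCEPTION_STKSZ. It
treats the debug stack as a byte array and the IST entry as an offset
into it, then checks whether the outer int3 frame survives a nested
int3 taken inside or outside the subq/addq window.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder sizes for the model; the real kernel values differ. */
#define EXCEPTION_STKSZ 64
#define DEBUG_STACK_SZ  (2 * EXCEPTION_STKSZ)
#define FRAME_SZ        16

/* Hypothetical model of one CPU's debug stack and its IST entry. */
struct cpu_model {
	unsigned char debug_stack[DEBUG_STACK_SZ];
	size_t ist;	/* offset loaded into the stack pointer on int3 */
};

static void cpu_init(struct cpu_model *cpu)
{
	memset(cpu->debug_stack, 0, sizeof(cpu->debug_stack));
	cpu->ist = DEBUG_STACK_SZ;	/* the stack grows down from the top */
}

/*
 * Model int3 entry: start at the IST offset and push one frame filled
 * with 'tag'. Returns the offset of the pushed frame.
 */
static size_t int3_enter(struct cpu_model *cpu, unsigned char tag)
{
	size_t sp = cpu->ist;

	assert(sp >= FRAME_SZ);
	sp -= FRAME_SZ;
	memset(&cpu->debug_stack[sp], tag, FRAME_SZ);
	return sp;
}

/*
 * Returns 1 if the outer frame is intact after a nested int3, 0 if it
 * was overwritten. 'nested_inside_window' selects whether the nested
 * int3 fires while the IST entry is lowered (during do_sym) or after
 * it has been restored (the paranoid_exit path).
 */
static int outer_frame_survives(int nested_inside_window)
{
	struct cpu_model cpu;
	size_t outer, i;

	cpu_init(&cpu);
	outer = int3_enter(&cpu, 0xAA);

	cpu.ist -= EXCEPTION_STKSZ;	/* subq $EXCEPTION_STKSZ, INIT_TSS_IST */
	if (nested_inside_window)
		int3_enter(&cpu, 0xBB);	/* lands below the outer frame */
	cpu.ist += EXCEPTION_STKSZ;	/* addq $EXCEPTION_STKSZ, INIT_TSS_IST */
	if (!nested_inside_window)
		int3_enter(&cpu, 0xBB);	/* lands on top of the outer frame */

	for (i = 0; i < FRAME_SZ; i++)
		if (cpu.debug_stack[outer + i] != 0xAA)
			return 0;
	return 1;
}
```

In this model outer_frame_survives(1) reports an intact frame, while
outer_frame_survives(0) reports that the nested entry overwrote it,
which is the case the traced save_stack_trace() runs into.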

There's no reason to function trace the stack dump code, and the patch
below stops the bug from triggering for me.
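
For comparison, tracing can also be disabled per function instead of
per object file. The sketch below assumes that notrace expands to GCC's
no_instrument_function attribute, and count_frames() is a made-up
helper used only for illustration; it is not part of the patch.
Dropping -pg for a whole file has the same effect on every function in
that file.

```c
#include <assert.h>

/* Assumption: notrace maps to GCC's no_instrument_function attribute. */
#define notrace __attribute__((no_instrument_function))

/*
 * Hypothetical helper: count the entries of a zero-terminated stack
 * trace buffer, looking at no more than 'max' slots. Because it is
 * marked notrace, building with -pg should not add a profiling call
 * at its entry.
 */
static notrace int count_frames(const unsigned long *entries, int max)
{
	int n = 0;

	assert(max >= 0);
	while (n < max && entries[n] != 0)
		n++;
	return n;
}
```

The per-file approach is less fragile for code like the stack dumpers,
since a helper added later cannot be left traced by accident.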

Dave, can you give this a try too?

Thanks!

-- Steve

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 532d2e0..0026999 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -14,6 +14,10 @@ CFLAGS_REMOVE_paravirt-spinlocks.o = -pg
 CFLAGS_REMOVE_pvclock.o = -pg
 CFLAGS_REMOVE_kvmclock.o = -pg
 CFLAGS_REMOVE_ftrace.o = -pg
+CFLAGS_REMOVE_dumpstack.o = -pg
+CFLAGS_REMOVE_dumpstack_32.o = -pg
+CFLAGS_REMOVE_dumpstack_64.o = -pg
+CFLAGS_REMOVE_stacktrace.o = -pg
 CFLAGS_REMOVE_early_printk.o = -pg
 endif
 

