linux-kernel - Re: [PATCH v6] Fix INT1 Recursion with unregistered breakpoints

Message-ID: <CAO6TR8W5LQ74caRGwJnPSmoVztngOW-RNK0ex-kB5sdiU5grQg@mail.gmail.com>
Date:	Mon, 11 Jan 2016 10:29:52 -0700
From:	Jeff Merkey <linux.mdb@...il.com>
To:	linux-kernel@...r.kernel.org
Cc:	tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
	x86@...nel.org, peterz@...radead.org, luto@...nel.org
Subject: Re: [PATCH v6] Fix INT1 Recursion with unregistered breakpoints

On 12/19/15, Jeff Merkey <linux.mdb@...il.com> wrote:
> Please consider the attached patch.
>
> SUMMARY
>
> This patch corrects a hard lockup failure of the system kernel if the
> operating system receives a breakpoint exception at a code execution
> address which was not registered with the operating system.  The patch
> allows kernel debuggers, application profiling and performance modules,
> and external debugging tools to work better together at sharing the
> breakpoint registers on the platform in a way that they do not cause
> errors and system faults, and enables the full feature set in the
> breakpoint API.  If a kernel application triggers a breakpoint
> or programs one in error, this patch will catch the condition and report
> it to the system log without the operating system experiencing a system
> fault.  There are several consumers of the Linux Breakpoint API and all
> of them can and sometimes do cause the condition this patch corrects.
>
> CONDITIONS WHICH RESULT IN THIS SYSTEM FAULT
>
> This system fault can arise from several sources.  Any kernel code
> can write to the debug registers directly and trigger a breakpoint;
> if no breakpoint was registered via arch_install_hw_breakpoint(),
> the result is a hard system hang.
>
> kgdb/kdb and the perf event system both manufacture garbage status
> for dr6.  They write this status into the thread.debugreg6 variable
> and then, in some cases, call hw_breakpoint_restore(), which writes
> the status back into the dr6 hardware register.
>
> arch/x86/kernel/kgdb.c
> static void kgdb_hw_overflow_handler(struct perf_event *event,
> 		struct perf_sample_data *data, struct pt_regs *regs)
> {
> 	struct task_struct *tsk = current;
> 	int i;
>
> 	for (i = 0; i < 4; i++)
> 		if (breakinfo[i].enabled)
> 			tsk->thread.debugreg6 |= (DR_TRAP0 << i);
> }
>
> arch/x86/kernel/kgdb.c
> static void kgdb_correct_hw_break(void)
> {
>         ... snip ...
>
> 	if (!dbg_is_early)
> 		hw_breakpoint_restore();
>
>         ... snip ...
> }
>
> arch/x86/kernel/hw_breakpoint.c
> void hw_breakpoint_restore(void)
> {
> 	set_debugreg(__this_cpu_read(cpu_debugreg[0]), 0);
> 	set_debugreg(__this_cpu_read(cpu_debugreg[1]), 1);
> 	set_debugreg(__this_cpu_read(cpu_debugreg[2]), 2);
> 	set_debugreg(__this_cpu_read(cpu_debugreg[3]), 3);
> 	set_debugreg(current->thread.debugreg6, 6);
> 	set_debugreg(__this_cpu_read(cpu_dr7), 7);
> }
>
> The hardware only alters those bits that change; the rest of the
> software-written dr6 value remains in the register.
>
> Upon the next int1 exception, dr6 presents this manufactured status to
> the int1 handler in hw_breakpoint.c, which then processes breakpoint
> exceptions that never occurred and calls any handlers that were validly
> registered for those slots, creating phantom events.  If other
> subsystems which call the perf handlers also have breakpoints
> registered, this manufactured status causes erroneous events to be
> signaled to the layers above.
>
> arch/x86/kernel/hw_breakpoint.c
> static int hw_breakpoint_handler(struct die_args *args)
> {
>         ... snip ...
>
> 	/* Handle all the breakpoints that were triggered */
> 	for (i = 0; i < HBP_NUM; ++i) {
> 		if (likely(!(dr6 & (DR_TRAP0 << i))))
> 			continue;
>
>                 ... snip ...
>
> 		perf_bp_event(bp, args->regs);
>
>                 ... snip ...
> 	}
>
> After a few iterations of this cycling through the system, the
> thread.debugreg6 variable starts to resemble a random number generator
> with respect to which breakpoint just occurred.
>
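The chain described above can be sketched in userspace C.  This is only
a model under stated assumptions, not kernel code: the function names
model_kgdb_overflow(), model_hw_trap() and model_count_reported() are
hypothetical, and model_hw_trap() assumes, as the text above states,
that the hardware sets the bit for the slot that fired and leaves the
remaining dr6 bits as software last wrote them.

```c
#include <assert.h>

#define DR_TRAP0 0x1UL	/* B0 status bit in dr6, as in debugreg.h */
#define HBP_NUM  4	/* number of hardware breakpoint slots */

/* Mirrors the loop in kgdb_hw_overflow_handler(): every enabled
 * slot is marked as hit, whether or not it actually fired. */
static unsigned long model_kgdb_overflow(unsigned long debugreg6,
					 const int enabled[HBP_NUM])
{
	int i;

	for (i = 0; i < HBP_NUM; i++)
		if (enabled[i])
			debugreg6 |= (DR_TRAP0 << i);
	return debugreg6;
}

/* ASSUMPTION: hardware only sets the bit for the slot that fired and
 * leaves the other status bits as they were last written. */
static unsigned long model_hw_trap(unsigned long dr6, int slot)
{
	assert(slot >= 0 && slot < HBP_NUM);
	return dr6 | (DR_TRAP0 << slot);
}

/* Counts the slots the dispatch loop in hw_breakpoint_handler()
 * would treat as triggered for a given dr6 value. */
static int model_count_reported(unsigned long dr6)
{
	int i, n = 0;

	for (i = 0; i < HBP_NUM; i++)
		if (dr6 & (DR_TRAP0 << i))
			n++;
	return n;
}
```

In this model, if slots 0 and 2 are enabled and the manufactured value
is restored into dr6, a single real trap on slot 1 is seen by the
dispatch loop as three triggered slots, two of them phantom.
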
> The perf handlers cause a different incarnation of this problem and
> create the situation by triggering a stale breakpoint set in dr7 for
> which the perf bp is NULL (not registered) or late and for which there
> is a single code path that fails to set the resume flag and clear the
> int1 exception status.
>
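The dr7 check the patch uses to decide whether the resume flag is
needed can be modeled in userspace as follows.  This is a sketch only:
slot_is_execute() and flags_for_resume() are hypothetical helper names,
the constants are copied here so the block stands alone, and
DR_RW_MASK is the value this patch adds.

```c
#include <assert.h>
#include <stdbool.h>

#define DR_CONTROL_SHIFT 16	/* first RW/LEN field starts at bit 16 of dr7 */
#define DR_CONTROL_SIZE  4	/* bits of RW/LEN control per slot */
#define DR_RW_EXECUTE    0x0
#define DR_RW_WRITE      0x1
#define DR_RW_READ       0x3
#define DR_RW_MASK       0x3	/* mask for breakpoint type field */
#define X86_EFLAGS_RF    (1UL << 16)	/* resume flag */

/* Extract the type field for slot i from dr7 and test whether it
 * describes an instruction (execute) breakpoint. */
static bool slot_is_execute(unsigned long dr7, int i)
{
	assert(i >= 0 && i < 4);
	return ((dr7 >> ((i * DR_CONTROL_SIZE) + DR_CONTROL_SHIFT))
		& DR_RW_MASK) == DR_RW_EXECUTE;
}

/* Set the resume flag only for execute breakpoints, so that the
 * faulting instruction does not raise int1 again on return. */
static unsigned long flags_for_resume(unsigned long flags,
				      unsigned long dr7, int i)
{
	if (slot_is_execute(dr7, i))
		flags |= X86_EFLAGS_RF;
	return flags;
}
```

The point of deriving the type from dr7 rather than from the perf_event
is that the check still works when bp is NULL.  Note that this model
looks only at the type field and does not consult the enable bits in
dr7.
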
> TESTING AND REVIEW PERFORMED
>
> I have reviewed all the code that this patch touches and have
> determined that it will function properly and support all of the
> software that depends on this handler.  I have compiled and tested
> this patch with a test harness that tests the robustness of the Linux
> breakpoint API and handlers in the following ways:
>
> 1.  Setting multiple conditional breakpoints through the
> arch_install_hw_breakpoint() API across four processors to test the
> rate at which the interface can handle breakpoint exceptions.
>
> 2.  Setting unregistered breakpoints to test the handler's robustness
> against error conditions and errant or spurious hardware conditions,
> and to simulate actual "lazy debug register switching" with null bp
> handlers.
>
> 3.  Clearing and setting breakpoints across multiple processors then
> triggering concurrent exceptions in both interrupt and process
> contexts.
>
> This patch improves robustness in the Linux kernel in several ways:
>
> 1.  Corrects a bug in the handling of unregistered breakpoints.
>
> 2.  Provides a hardware check of dr7 to determine the source of a
> breakpoint if the OS cannot ascertain the int1 source from its own
> state and variables.
>
> 3.  Enables "lazy debug register switching" to function correctly.
>
> Signed-off-by: Jeff Merkey <linux.mdb@...il.com>
> ---
>  arch/x86/include/uapi/asm/debugreg.h |  1 +
>  arch/x86/kernel/hw_breakpoint.c      | 25 +++++++++++++++++++++++--
>  2 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/uapi/asm/debugreg.h b/arch/x86/include/uapi/asm/debugreg.h
> index 3c0874d..78fc83c 100644
> --- a/arch/x86/include/uapi/asm/debugreg.h
> +++ b/arch/x86/include/uapi/asm/debugreg.h
> @@ -38,6 +38,7 @@
>  #define DR_RW_EXECUTE (0x0)   /* Settings for the access types to trap on */
>  #define DR_RW_WRITE (0x1)
>  #define DR_RW_READ (0x3)
> +#define DR_RW_MASK (0x3) /* mask for breakpoint type field */
>
>  #define DR_LEN_1 (0x0) /* Settings for data length to trap on */
>  #define DR_LEN_2 (0x4)
> diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
> index 50a3fad..d199834 100644
> --- a/arch/x86/kernel/hw_breakpoint.c
> +++ b/arch/x86/kernel/hw_breakpoint.c
> @@ -444,7 +444,7 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
>  static int hw_breakpoint_handler(struct die_args *args)
>  {
>  	int i, cpu, rc = NOTIFY_STOP;
> -	struct perf_event *bp;
> +	struct perf_event *bp = NULL;
>  	unsigned long dr7, dr6;
>  	unsigned long *dr6_p;
>
> @@ -477,6 +477,14 @@ static int hw_breakpoint_handler(struct die_args *args)
>  			continue;
>
>  		/*
> +		 * Check if we got an execute breakpoint, if so
> +		 * set the resume flag to avoid int1 recursion.
> +		 */
> +		if (((dr7 >> ((i * DR_CONTROL_SIZE) + DR_CONTROL_SHIFT))
> +			& DR_RW_MASK) == DR_RW_EXECUTE)
> +			args->regs->flags |= X86_EFLAGS_RF;
> +
> +		/*
>  		 * The counter may be concurrently released but that can only
>  		 * occur from a call_rcu() path. We can then safely fetch
>  		 * the breakpoint, use its callback, touch its counter
> @@ -503,7 +511,8 @@ static int hw_breakpoint_handler(struct die_args *args)
>
>  		/*
>  		 * Set up resume flag to avoid breakpoint recursion when
> -		 * returning back to origin.
> +		 * returning back to origin.  perf_bp_event may
> +		 * change the flags so check twice.
>  		 */
>  		if (bp->hw.info.type == X86_BREAKPOINT_EXECUTE)
>  			args->regs->flags |= X86_EFLAGS_RF;
> @@ -519,6 +528,18 @@ static int hw_breakpoint_handler(struct die_args *args)
>  	    (dr6 & (~DR_TRAP_BITS)))
>  		rc = NOTIFY_DONE;
>
> +	/*
> +	 * if we are about to signal to
> +	 * do_debug() to stop further processing
> +	 * and we have not ascertained the source
> +	 * of the breakpoint, log it as spurious.
> +	 */
> +	if (rc == NOTIFY_STOP && !bp) {
> +		printk_ratelimited(KERN_INFO
> +				"INFO: spurious INT1 exception dr6: 0x%lX dr7: 0x%lX\n",
> +				dr6, dr7);
> +	}
> +
>  	set_debugreg(dr7, 7);
>  	put_cpu();
>
> --
> 1.8.3.1
>
>

I have not seen this patch turn up in linux-next or in the git pull
fixes.  Are you going to put it in or not?  If not, please show me the
courtesy of letting me know, so that I know whether I have to maintain
this fix locally to get around this bug.

Thanks

Jeff