linux-kernel - Re: [PATCH 2/3] x86/traps: Print non-canonical address on #GP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20191114174630.GF24045@linux.intel.com>
Date:   Thu, 14 Nov 2019 09:46:30 -0800
From:   Sean Christopherson <sean.j.christopherson@...el.com>
To:     Jann Horn <jannh@...gle.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Alexander Potapenko <glider@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>, kasan-dev@...glegroups.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] x86/traps: Print non-canonical address on #GP

On Tue, Nov 12, 2019 at 10:10:01PM +0100, Jann Horn wrote:
> A frequent cause of #GP exceptions are memory accesses to non-canonical
> addresses. Unlike #PF, #GP doesn't come with a fault address in CR2, so
> the kernel doesn't currently print the fault address for #GP.
> Luckily, we already have the necessary infrastructure for decoding X86
> instructions and computing the memory address that is being accessed;
> hook it up to the #GP handler so that we can figure out whether the #GP
> looks like it was caused by a non-canonical address, and if so, print
> that address.
> 
> While it is already possible to compute the faulting address manually by
> disassembling the opcode dump and evaluating the instruction against the
> register dump, this should make it slightly easier to identify crashes
> at a glance.
> 
> Signed-off-by: Jann Horn <jannh@...gle.com>
> ---
>  arch/x86/kernel/traps.c | 45 +++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 43 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index c90312146da0..479cfc6e9507 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -56,6 +56,8 @@
>  #include <asm/mpx.h>
>  #include <asm/vm86.h>
>  #include <asm/umip.h>
> +#include <asm/insn.h>
> +#include <asm/insn-eval.h>
>  
>  #ifdef CONFIG_X86_64
>  #include <asm/x86_init.h>
> @@ -509,6 +511,42 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
>  	do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, 0, NULL);
>  }
>  
> +/*
> + * On 64-bit, if an uncaught #GP occurs while dereferencing a non-canonical
> + * address, print that address.
> + */
> +static void print_kernel_gp_address(struct pt_regs *regs)
> +{
> +#ifdef CONFIG_X86_64
> +	u8 insn_bytes[MAX_INSN_SIZE];
> +	struct insn insn;
> +	unsigned long addr_ref;
> +
> +	if (probe_kernel_read(insn_bytes, (void *)regs->ip, MAX_INSN_SIZE))
> +		return;
> +
> +	kernel_insn_init(&insn, insn_bytes, MAX_INSN_SIZE);
> +	insn_get_modrm(&insn);
> +	insn_get_sib(&insn);
> +	addr_ref = (unsigned long)insn_get_addr_ref(&insn, regs);
> +
> +	/*
> +	 * If insn_get_addr_ref() failed or we got a canonical address in the
> +	 * kernel half, bail out.
> +	 */
> +	if ((addr_ref | __VIRTUAL_MASK) == ~0UL)
> +		return;
> +	/*
> +	 * For the user half, check against TASK_SIZE_MAX; this way, if the
> +	 * access crosses the canonical address boundary, we don't miss it.
> +	 */
> +	if (addr_ref <= TASK_SIZE_MAX)

Any objection to open coding the upper bound instead of using
TASK_SIZE_MASK to make the threshold more obvious?

> +		return;
> +
> +	pr_alert("dereferencing non-canonical address 0x%016lx\n", addr_ref);

Printing the raw address will confuse users in the case where the access
straddles the lower canonical boundary.  Maybe combine this with open
coding the straddle case?  With a rough heuristic to hedge a bit for
instructions whose operand size isn't accurately reflected in opnd_bytes.

	if (addr_ref > __VIRTUAL_MASK)
		pr_alert("dereferencing non-canonical address 0x%016lx\n", addr_ref);
	else if ((addr_ref + insn->opnd_bytes - 1) > __VIRTUAL_MASK)
		pr_alert("straddling non-canonical boundary 0x%016lx - 0x%016lx\n",
			 addr_ref, addr_ref + insn->opnd_bytes - 1);
	else if ((addr_ref + PAGE_SIZE - 1) > __VIRTUAL_MASK)
		pr_alert("potentially straddling non-canonical boundary 0x%016lx - 0x%016lx\n",
			 addr_ref, addr_ref + PAGE_SIZE - 1);

> +#endif
> +}
> +
>  dotraplinkage void
>  do_general_protection(struct pt_regs *regs, long error_code)
>  {
> @@ -547,8 +585,11 @@ do_general_protection(struct pt_regs *regs, long error_code)
>  			return;
>  
>  		if (notify_die(DIE_GPF, desc, regs, error_code,
> -			       X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP)
> -			die(desc, regs, error_code);
> +			       X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP)
> +			return;
> +
> +		print_kernel_gp_address(regs);

This can be conditional on '!error_code', non-canonical faults on the
direct access always have zero for the error code.  Doubt it will matter
in practice, but far calls and other silly segment instructions can
generate non-zero error codes on #GP in 64-bit mode.

> +		die(desc, regs, error_code);
>  		return;
>  	}
>  
> -- 
> 2.24.0.432.g9d3f5f5b63-goog
>