linux-kernel - Re: [RFC][PATCH 5/8] x86/mm: fix exception table comments

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1536354280.11460.28.camel@intel.com>
Date:   Fri, 07 Sep 2018 14:04:40 -0700
From:   Sean Christopherson <sean.j.christopherson@...el.com>
To:     Dave Hansen <dave.hansen@...ux.intel.com>,
        linux-kernel@...r.kernel.org
Cc:     peterz@...radead.org, tglx@...utronix.de, x86@...nel.org,
        luto@...nel.org
Subject: Re: [RFC][PATCH 5/8] x86/mm: fix exception table comments

On Fri, 2018-09-07 at 12:49 -0700, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@...ux.intel.com>
> 
> The comments here are wrong.  They are too absolute about where
> faults can occur when running in the kernel.  The comments are
> also a bit hard to match up with the code.
> 
> Trim down the comments, and make them more precise.
> 
> Also add a comment explaining why we are doing the
> bad_area_nosemaphore() path here.
> 
> Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
> Cc: Sean Christopherson <sean.j.christopherson@...el.com>
> Cc: "Peter Zijlstra (Intel)" <peterz@...radead.org>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: x86@...nel.org
> Cc: Andy Lutomirski <luto@...nel.org>
> ---
> 
>  b/arch/x86/mm/fault.c |   27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
> 
> diff -puN arch/x86/mm/fault.c~pkeys-fault-warnings-03 arch/x86/mm/fault.c
> --- a/arch/x86/mm/fault.c~pkeys-fault-warnings-03	2018-09-07 11:21:47.696751898 -0700
> +++ b/arch/x86/mm/fault.c	2018-09-07 11:21:47.700751898 -0700
> @@ -1349,24 +1349,25 @@ void do_user_addr_space_fault(struct pt_
>  		flags |= FAULT_FLAG_INSTRUCTION;
>  
>  	/*
> -	 * When running in the kernel we expect faults to occur only to
> -	 * addresses in user space.  All other faults represent errors in
> -	 * the kernel and should generate an OOPS.  Unfortunately, in the
> -	 * case of an erroneous fault occurring in a code path which already
> -	 * holds mmap_sem we will deadlock attempting to validate the fault
> -	 * against the address space.  Luckily the kernel only validly
> -	 * references user space from well defined areas of code, which are
> -	 * listed in the exceptions table.
> +	 * Kernel-mode access to the user address space should only occur
> +	 * inside well-defined areas of code listed in the exception
> +	 * tables.  But, an erroneous kernel fault occurring outside one of
> +	 * those areas which also holds mmap_sem might deadlock attempting
> +	 * to validate the fault against the address space.
>  	 *
> -	 * As the vast majority of faults will be valid we will only perform
> -	 * the source reference check when there is a possibility of a
> -	 * deadlock. Attempt to lock the address space, if we cannot we then
> -	 * validate the source. If this is invalid we can skip the address
> -	 * space check, thus avoiding the deadlock:
> +	 * Only do the expensive exception table search when we might be at
> +	 * risk of a deadlock:
> +	 * 1. We failed to acquire mmap_sem, and
> +	 * 2. The access was an explicit kernel-mode access
> +	 *    (X86_PF_USER=0).

Might be worth reminding the reader that X86_PF_USER will be set in
sw_error_code for implicit accesses.  I saw "explicit" and my mind
immediately jumped to hw_error_code for whatever reason.  E.g.:

	* 2. The access was an explicit kernel-mode access (we set X86_PF_USER
	*    in sw_error_code for implicit kernel-mode accesses).

>  	 */
>  	if (unlikely(!down_read_trylock(&mm->mmap_sem))) {
>  		if (!(sw_error_code & X86_PF_USER) &&
>  		    !search_exception_tables(regs->ip)) {
> +			/*
> +			 * Fault from code in kernel from
> +			 * which we do not expect faults.
> +			 */
>  			bad_area_nosemaphore(regs, sw_error_code, address, NULL);
>  			return;
>  		}
> _