Message-ID: <20170729010632.GE84665@google.com>
Date:   Fri, 28 Jul 2017 18:06:32 -0700
From:   Matthias Kaehlcke <mka@...omium.org>
To:     Josh Poimboeuf <jpoimboe@...hat.com>
Cc:     Andrey Ryabinin <ryabinin.a.a@...il.com>,
        Chris J Arges <chris.j.arges@...onical.com>,
        Borislav Petkov <bp@...e.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        "x86@...nel.org" <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Douglas Anderson <dianders@...omium.org>,
        Michael Davidson <md@...gle.com>,
        Greg Hackmann <ghackmann@...gle.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Stephen Hines <srhines@...gle.com>,
        Kees Cook <keescook@...omium.org>,
        Arnd Bergmann <arnd@...db.de>,
        Bernhard Rosenkränzer 
        <Bernhard.Rosenkranzer@...aro.org>
Subject: Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in
 get_user() inline asm"

On Fri, Jul 28, 2017 at 07:55:21PM -0500, Josh Poimboeuf said:

> On Fri, Jul 28, 2017 at 05:38:52PM -0700, Matthias Kaehlcke wrote:
> > On Thu, Jul 20, 2017 at 03:56:52PM -0500, Josh Poimboeuf said:
> > 
> > > On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote:
> > > > FWIW, below is my understanding of what's going on.
> > > > 
> > > > It seems clang treats a local named register variable almost the same
> > > > as an ordinary local variable.
> > > > The only difference is that before reading the register variable, clang
> > > > puts the variable's value into the specified register.
> > > > 
> > > > So clang just assigns a stack slot for the variable __sp where it's
> > > > going to keep the variable's value.
> > > > But since __sp is uninitialized (we haven't assigned anything to it), the
> > > > value of __sp is some garbage from the stack.
> > > > The inline asm specifies __sp as an input, so clang assumes that it has
> > > > to load __sp into 'rsp' because the inline asm is going to use
> > > > it. And it just loads garbage from the stack into 'rsp'.
> > > > 
> > > > In fact, such behavior (I mean storing the value on the stack and
> > > > loading it into the register before use) is very useful.
> > > > Clang's behavior allows keeping the value assigned to a
> > > > call-clobbered register across function calls.
> > > > 
> > > > Unlike clang, gcc assigns the value to the register right away and
> > > > doesn't store the value anywhere else. So if the register is a
> > > > call-clobbered register, you have to be absolutely sure that there is
> > > > no subsequent function call that might clobber it.
> > > > 
> > > > E.g. see some real examples:
> > > > https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda ("msm: scm:
> > > > Fix improper register assignment").
> > > > These bugs shouldn't happen with clang.
> > > > 
> > > > But a global named register variable works slightly differently in
> > > > clang. For the global, the value is just the value of the register
> > > > itself, whatever it is. A read/write of a global named register variable
> > > > is just like a direct read/write of the register.
> > > 
> > > Thanks, that clears up a lot of the confusion for me.
> > > 
> > > Still, unfortunately, I don't think that's going to work for GCC.
> > > Changing the '__sp' register variable to global in the header file
> > > causes it to make a *bunch* of changes across the kernel, even in
> > > functions which don't do inline asm.  It seems to be disabling some
> > > optimizations across the board.
> > > 
> > > I do have another idea, which is to replace all uses of
> > > 
> > >   asm(" ... call foo ... " : outputs : inputs : clobbers);
> > > 
> > > with a new ASM_CALL macro:
> > > 
> > >   ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers);
> > > 
> > > Then the compiler differences can be abstracted out, with GCC adding
> > > "sp" as an output constraint and clang doing nothing (for now).
> > 
> > The idea sounds interesting; however, I see two issues with ASM_CALL():
> > 
> > Inline assembly expects the individual elements of outputs, inputs and
> > clobbers to be comma-separated, and so does the macro for its
> > parameters.
> 
> I think this isn't a problem, see the (untested and unfinished) patch
> below for an idea of how the arguments can be handled.

Good point!
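
To spell it out for myself: wrapping each operand list in ARGS(), or passing
it as a variadic macro parameter (as the patch does for the clobbers), keeps
the preprocessor from splitting the list at its commas when ASM_CALL() is
invoked. An untested sketch with made-up operands, just to show the expansion:

  #define ARGS(args...) args
  #define ASM_CALL(str, outputs, inputs, clobbers...) \
          asm volatile(str : outputs : inputs : clobbers)

  /* expands to:
     asm volatile("call foo" : "=a" (ret), "=d" (hi) : "D" (arg) : "memory", "cc") */
  ASM_CALL("call foo",
           ARGS("=a" (ret), "=d" (hi)),
           "D" (arg),
           ARGS("memory", "cc"));

(That's the !CONFIG_FRAME_POINTER variant; the frame pointer build in the
patch additionally appends "sp" to the clobber list.)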

> > The assembler template can refer to the position of output and input
> > operands; adding "sp" as an output changes the positions of the inputs
> > with respect to the clang version.
> 
> True, though I think we could convert all ASM_CALL users to use named
> operands instead of the (bug-prone) numbered ones.  It would be an
> improvement regardless.

Agreed, that sounds like a good solution and a desirable improvement.
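
For get_user() that could look something like this (untested, and [size] is
just an illustrative name for the operand):

  ASM_CALL("call __get_user_%P[size]",
           ARGS("=a" (__ret_gu), "=r" (__val_gu)),
           ARGS("0" (ptr), [size] "i" (sizeof(*(ptr)))));

That way the template no longer cares where the size operand ends up in the
operand list.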

> > Not sure how to move forward from here. Not even using an ugly #ifdef
> > seems to be a halfway reasonable solution, since get_user() isn't the
> > only place using this construct and #ifdefs would result in highly
> > redundant macros in multiple places.
> 
> I still think ASM_CALL might work.  The below patch is what I came up
> with last week before I got pulled into some other work.  I'll be out on
> vacation next week but I hope to finish the patch after that.

Thanks for working on this.
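
One cosmetic thing I noticed while reading through: call sites without
outputs or inputs end up with bare commas, e.g. the preempt.h hunk has

  ASM_CALL("call ___preempt_schedule",,);

Not a problem, just slightly unusual to read.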

Enjoy your vacation!

Matthias

> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> index 1b020381ab38..5da4bcbeebab 100644
> --- a/arch/x86/include/asm/alternative.h
> +++ b/arch/x86/include/asm/alternative.h
> @@ -216,14 +216,14 @@ static inline int alternatives_text_reserved(void *start, void *end)
>   * Otherwise, old function is used.
>   */
>  #define alternative_call_2(oldfunc, newfunc1, feature1, newfunc2, feature2,   \
> -			   output, input...)				      \
> +			   output, input, clobbers...)			      \
>  {									      \
> -	register void *__sp asm(_ASM_SP);				      \
> -	asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\
> -		"call %P[new2]", feature2)				      \
> -		: output, "+r" (__sp)					      \
> -		: [old] "i" (oldfunc), [new1] "i" (newfunc1),		      \
> -		  [new2] "i" (newfunc2), ## input);			      \
> +	ASM_CALL(ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,     \
> +			       "call %P[new2]", feature2),		      \
> +		 ARGS(output),						      \
> +		 ARGS([old] "i" (oldfunc), [new1] "i" (newfunc1),	      \
> +		      [new2] "i" (newfunc2), ARGS(input)),		      \
> +		 ## clobbers);						      \
>  }
>  
>  /*
> diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
> index b4a0d43248cf..21adcc8e739f 100644
> --- a/arch/x86/include/asm/page_64.h
> +++ b/arch/x86/include/asm/page_64.h
> @@ -45,8 +45,8 @@ static inline void clear_page(void *page)
>  			   clear_page_rep, X86_FEATURE_REP_GOOD,
>  			   clear_page_erms, X86_FEATURE_ERMS,
>  			   "=D" (page),
> -			   "0" (page)
> -			   : "memory", "rax", "rcx");
> +			   "0" (page),
> +			   ARGS("memory", "rax", "rcx"));
>  }
>  
>  void copy_page(void *to, void *from);
> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> index cb976bab6299..2a585b799a3e 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -471,8 +471,7 @@ int paravirt_disable_iospace(void);
>   */
>  #ifdef CONFIG_X86_32
>  #define PVOP_VCALL_ARGS							\
> -	unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx;	\
> -	register void *__sp asm("esp")
> +	unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx;
>  #define PVOP_CALL_ARGS			PVOP_VCALL_ARGS
>  
>  #define PVOP_CALL_ARG1(x)		"a" ((unsigned long)(x))
> @@ -492,8 +491,7 @@ int paravirt_disable_iospace(void);
>  /* [re]ax isn't an arg, but the return val */
>  #define PVOP_VCALL_ARGS						\
>  	unsigned long __edi = __edi, __esi = __esi,		\
> -		__edx = __edx, __ecx = __ecx, __eax = __eax;	\
> -	register void *__sp asm("rsp")
> +		__edx = __edx, __ecx = __ecx, __eax = __eax;
>  #define PVOP_CALL_ARGS		PVOP_VCALL_ARGS
>  
>  #define PVOP_CALL_ARG1(x)		"D" ((unsigned long)(x))
> @@ -541,24 +539,24 @@ int paravirt_disable_iospace(void);
>  		/* This is 32-bit specific, but is okay in 64-bit */	\
>  		/* since this condition will never hold */		\
>  		if (sizeof(rettype) > sizeof(unsigned long)) {		\
> -			asm volatile(pre				\
> -				     paravirt_alt(PARAVIRT_CALL)	\
> -				     post				\
> -				     : call_clbr, "+r" (__sp)		\
> -				     : paravirt_type(op),		\
> -				       paravirt_clobber(clbr),		\
> -				       ##__VA_ARGS__			\
> -				     : "memory", "cc" extra_clbr);	\
> +			ASM_CALL(pre					\
> +				 paravirt_alt(PARAVIRT_CALL)		\
> +				 post,					\
> +				 ARGS(call_clbr),			\
> +				 ARGS(paravirt_type(op),		\
> +				      paravirt_clobber(clbr),		\
> +				      ##__VA_ARGS__),			\
> +				 ARGS("memory", "cc" extra_clbr));	\
>  			__ret = (rettype)((((u64)__edx) << 32) | __eax); \
>  		} else {						\
> -			asm volatile(pre				\
> -				     paravirt_alt(PARAVIRT_CALL)	\
> -				     post				\
> -				     : call_clbr, "+r" (__sp)		\
> -				     : paravirt_type(op),		\
> -				       paravirt_clobber(clbr),		\
> -				       ##__VA_ARGS__			\
> -				     : "memory", "cc" extra_clbr);	\
> +			ASM_CALL(pre					\
> +				 paravirt_alt(PARAVIRT_CALL)		\
> +				 post,					\
> +				 ARGS(call_clbr),			\
> +				 ARGS(paravirt_type(op),		\
> +				      paravirt_clobber(clbr),		\
> +				      ##__VA_ARGS__),			\
> +				 ARGS("memory", "cc" extra_clbr));	\
>  			__ret = (rettype)(__eax & PVOP_RETMASK(rettype));	\
>  		}							\
>  		__ret;							\
> @@ -578,14 +576,14 @@ int paravirt_disable_iospace(void);
>  	({								\
>  		PVOP_VCALL_ARGS;					\
>  		PVOP_TEST_NULL(op);					\
> -		asm volatile(pre					\
> -			     paravirt_alt(PARAVIRT_CALL)		\
> -			     post					\
> -			     : call_clbr, "+r" (__sp)			\
> -			     : paravirt_type(op),			\
> -			       paravirt_clobber(clbr),			\
> -			       ##__VA_ARGS__				\
> -			     : "memory", "cc" extra_clbr);		\
> +		ASM_CALL(pre						\
> +			 paravirt_alt(PARAVIRT_CALL)			\
> +			 post,						\
> +			 ARGS(call_clbr),				\
> +			 ARGS(paravirt_type(op),			\
> +			      paravirt_clobber(clbr),			\
> +			      ##__VA_ARGS__),				\
> +			 ARGS("memory", "cc" extra_clbr));		\
>  	})
>  
>  #define __PVOP_VCALL(op, pre, post, ...)				\
> diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
> index ec1f3c651150..19a034cbde37 100644
> --- a/arch/x86/include/asm/preempt.h
> +++ b/arch/x86/include/asm/preempt.h
> @@ -102,16 +102,14 @@ static __always_inline bool should_resched(int preempt_offset)
>    extern asmlinkage void ___preempt_schedule(void);
>  # define __preempt_schedule()					\
>  ({								\
> -	register void *__sp asm(_ASM_SP);			\
> -	asm volatile ("call ___preempt_schedule" : "+r"(__sp));	\
> +	ASM_CALL("call ___preempt_schedule",,);			\
>  })
>  
>    extern asmlinkage void preempt_schedule(void);
>    extern asmlinkage void ___preempt_schedule_notrace(void);
>  # define __preempt_schedule_notrace()					\
>  ({									\
> -	register void *__sp asm(_ASM_SP);				\
> -	asm volatile ("call ___preempt_schedule_notrace" : "+r"(__sp));	\
> +	ASM_CALL("call ___preempt_schedule_notrace",,);			\
>  })
>    extern asmlinkage void preempt_schedule_notrace(void);
>  #endif
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 550bc4ab0df4..2bbfb777c8bb 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -677,20 +677,19 @@ static inline void sync_core(void)
>  	 * Like all of Linux's memory ordering operations, this is a
>  	 * compiler barrier as well.
>  	 */
> -	register void *__sp asm(_ASM_SP);
>  
>  #ifdef CONFIG_X86_32
> -	asm volatile (
> +	ASM_CALL(
>  		"pushfl\n\t"
>  		"pushl %%cs\n\t"
>  		"pushl $1f\n\t"
>  		"iret\n\t"
> -		"1:"
> -		: "+r" (__sp) : : "memory");
> +		"1:",
> +		, , "memory");
>  #else
>  	unsigned int tmp;
>  
> -	asm volatile (
> +	ASM_CALL(
>  		UNWIND_HINT_SAVE
>  		"mov %%ss, %0\n\t"
>  		"pushq %q0\n\t"
> @@ -702,8 +701,8 @@ static inline void sync_core(void)
>  		"pushq $1f\n\t"
>  		"iretq\n\t"
>  		UNWIND_HINT_RESTORE
> -		"1:"
> -		: "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
> +		"1:",
> +		"=&r" (tmp), , ARGS("cc", "memory"));
>  #endif
>  }
>  
> diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
> index a34e0d4b957d..9d20f4fb5d7c 100644
> --- a/arch/x86/include/asm/rwsem.h
> +++ b/arch/x86/include/asm/rwsem.h
> @@ -99,25 +99,24 @@ static inline bool __down_read_trylock(struct rw_semaphore *sem)
>  /*
>   * lock for writing
>   */
> -#define ____down_write(sem, slow_path)			\
> -({							\
> -	long tmp;					\
> -	struct rw_semaphore* ret;			\
> -	register void *__sp asm(_ASM_SP);		\
> -							\
> -	asm volatile("# beginning down_write\n\t"	\
> -		     LOCK_PREFIX "  xadd      %1,(%4)\n\t"	\
> -		     /* adds 0xffff0001, returns the old value */ \
> -		     "  test " __ASM_SEL(%w1,%k1) "," __ASM_SEL(%w1,%k1) "\n\t" \
> -		     /* was the active mask 0 before? */\
> -		     "  jz        1f\n"			\
> -		     "  call " slow_path "\n"		\
> -		     "1:\n"				\
> -		     "# ending down_write"		\
> -		     : "+m" (sem->count), "=d" (tmp), "=a" (ret), "+r" (__sp) \
> -		     : "a" (sem), "1" (RWSEM_ACTIVE_WRITE_BIAS) \
> -		     : "memory", "cc");			\
> -	ret;						\
> +#define ____down_write(sem, slow_path)					\
> +({									\
> +	long tmp;							\
> +	struct rw_semaphore* ret;					\
> +									\
> +	ASM_CALL("# beginning down_write\n\t"				\
> +		 LOCK_PREFIX "  xadd      %1,(%3)\n\t"			\
> +		 /* adds 0xffff0001, returns the old value */		\
> +		 "  test " __ASM_SEL(%w1,%k1) "," __ASM_SEL(%w1,%k1) "\n\t" \
> +		 /* was the active mask 0 before? */			\
> +		 "  jz        1f\n"					\
> +		 "  call " slow_path "\n"				\
> +		 "1:\n"							\
> +		 "# ending down_write",					\
> +		 ARGS("+m" (sem->count), "=d" (tmp), "=a" (ret)),	\
> +		 ARGS("a" (sem), "1" (RWSEM_ACTIVE_WRITE_BIAS)),	\
> +		 ARGS("memory", "cc"));					\
> +	ret;								\
>  })
>  
>  static inline void __down_write(struct rw_semaphore *sem)
> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
> index 184eb9894dae..c7be076ee703 100644
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -166,12 +166,11 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
>  ({									\
>  	int __ret_gu;							\
>  	register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX);		\
> -	register void *__sp asm(_ASM_SP);				\
>  	__chk_user_ptr(ptr);						\
>  	might_fault();							\
> -	asm volatile("call __get_user_%P4"				\
> -		     : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp)	\
> -		     : "0" (ptr), "i" (sizeof(*(ptr))));		\
> +	ASM_CALL("call __get_user_%P3",					\
> +		 ARGS("=a" (__ret_gu), "=r" (__val_gu)),		\
> +		 ARGS("0" (ptr), "i" (sizeof(*(ptr)))));		\
>  	(x) = (__force __typeof__(*(ptr))) __val_gu;			\
>  	__builtin_expect(__ret_gu, 0);					\
>  })
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index b16f6a1d8b26..8b63e2cb1eaf 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -38,10 +38,9 @@ copy_user_generic(void *to, const void *from, unsigned len)
>  			 X86_FEATURE_REP_GOOD,
>  			 copy_user_enhanced_fast_string,
>  			 X86_FEATURE_ERMS,
> -			 ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from),
> -				     "=d" (len)),
> -			 "1" (to), "2" (from), "3" (len)
> -			 : "memory", "rcx", "r8", "r9", "r10", "r11");
> +			 ARGS("=a" (ret), "=D" (to), "=S" (from), "=d" (len)),
> +			 ARGS("1" (to), "2" (from), "3" (len)),
> +			 ARGS("memory", "rcx", "r8", "r9", "r10", "r11"));
>  	return ret;
>  }
>  
> diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
> index 11071fcd630e..2feddeeec1d1 100644
> --- a/arch/x86/include/asm/xen/hypercall.h
> +++ b/arch/x86/include/asm/xen/hypercall.h
> @@ -114,9 +114,8 @@ extern struct { char _entry[32]; } hypercall_page[];
>  	register unsigned long __arg3 asm(__HYPERCALL_ARG3REG) = __arg3; \
>  	register unsigned long __arg4 asm(__HYPERCALL_ARG4REG) = __arg4; \
>  	register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5; \
> -	register void *__sp asm(_ASM_SP);
>  
> -#define __HYPERCALL_0PARAM	"=r" (__res), "+r" (__sp)
> +#define __HYPERCALL_0PARAM	"=r" (__res)
>  #define __HYPERCALL_1PARAM	__HYPERCALL_0PARAM, "+r" (__arg1)
>  #define __HYPERCALL_2PARAM	__HYPERCALL_1PARAM, "+r" (__arg2)
>  #define __HYPERCALL_3PARAM	__HYPERCALL_2PARAM, "+r" (__arg3)
> @@ -146,10 +145,10 @@ extern struct { char _entry[32]; } hypercall_page[];
>  ({									\
>  	__HYPERCALL_DECLS;						\
>  	__HYPERCALL_0ARG();						\
> -	asm volatile (__HYPERCALL					\
> -		      : __HYPERCALL_0PARAM				\
> -		      : __HYPERCALL_ENTRY(name)				\
> -		      : __HYPERCALL_CLOBBER0);				\
> +	ASM_CALL(__HYPERCALL,						\
> +		 __HYPERCALL_0PARAM,					\
> +		 __HYPERCALL_ENTRY(name),				\
> +		 __HYPERCALL_CLOBBER0);					\
>  	(type)__res;							\
>  })
>  
> @@ -157,10 +156,10 @@ extern struct { char _entry[32]; } hypercall_page[];
>  ({									\
>  	__HYPERCALL_DECLS;						\
>  	__HYPERCALL_1ARG(a1);						\
> -	asm volatile (__HYPERCALL					\
> -		      : __HYPERCALL_1PARAM				\
> -		      : __HYPERCALL_ENTRY(name)				\
> -		      : __HYPERCALL_CLOBBER1);				\
> +	ASM_CALL(__HYPERCALL,						\
> +		 __HYPERCALL_1PARAM,					\
> +		 __HYPERCALL_ENTRY(name),				\
> +		 __HYPERCALL_CLOBBER1);					\
>  	(type)__res;							\
>  })
>  
> @@ -168,10 +167,10 @@ extern struct { char _entry[32]; } hypercall_page[];
>  ({									\
>  	__HYPERCALL_DECLS;						\
>  	__HYPERCALL_2ARG(a1, a2);					\
> -	asm volatile (__HYPERCALL					\
> -		      : __HYPERCALL_2PARAM				\
> -		      : __HYPERCALL_ENTRY(name)				\
> -		      : __HYPERCALL_CLOBBER2);				\
> +	ASM_CALL(__HYPERCALL,						\
> +		 __HYPERCALL_2PARAM,					\
> +		 __HYPERCALL_ENTRY(name),				\
> +		 __HYPERCALL_CLOBBER2);					\
>  	(type)__res;							\
>  })
>  
> @@ -179,10 +178,10 @@ extern struct { char _entry[32]; } hypercall_page[];
>  ({									\
>  	__HYPERCALL_DECLS;						\
>  	__HYPERCALL_3ARG(a1, a2, a3);					\
> -	asm volatile (__HYPERCALL					\
> -		      : __HYPERCALL_3PARAM				\
> -		      : __HYPERCALL_ENTRY(name)				\
> -		      : __HYPERCALL_CLOBBER3);				\
> +	ASM_CALL(__HYPERCALL,						\
> +		 __HYPERCALL_3PARAM,					\
> +		 __HYPERCALL_ENTRY(name),				\
> +		 __HYPERCALL_CLOBBER3);					\
>  	(type)__res;							\
>  })
>  
> @@ -190,10 +189,10 @@ extern struct { char _entry[32]; } hypercall_page[];
>  ({									\
>  	__HYPERCALL_DECLS;						\
>  	__HYPERCALL_4ARG(a1, a2, a3, a4);				\
> -	asm volatile (__HYPERCALL					\
> -		      : __HYPERCALL_4PARAM				\
> -		      : __HYPERCALL_ENTRY(name)				\
> -		      : __HYPERCALL_CLOBBER4);				\
> +	ASM_CALL(__HYPERCALL,						\
> +		 __HYPERCALL_4PARAM,					\
> +		 __HYPERCALL_ENTRY(name),				\
> +		 __HYPERCALL_CLOBBER4);					\
>  	(type)__res;							\
>  })
>  
> @@ -201,10 +200,10 @@ extern struct { char _entry[32]; } hypercall_page[];
>  ({									\
>  	__HYPERCALL_DECLS;						\
>  	__HYPERCALL_5ARG(a1, a2, a3, a4, a5);				\
> -	asm volatile (__HYPERCALL					\
> -		      : __HYPERCALL_5PARAM				\
> -		      : __HYPERCALL_ENTRY(name)				\
> -		      : __HYPERCALL_CLOBBER5);				\
> +	ASM_CALL(__HYPERCALL,						\
> +		 __HYPERCALL_5PARAM,					\
> +		 __HYPERCALL_ENTRY(name),				\
> +		 __HYPERCALL_CLOBBER5);					\
>  	(type)__res;							\
>  })
>  
> @@ -218,10 +217,10 @@ privcmd_call(unsigned call,
>  	__HYPERCALL_5ARG(a1, a2, a3, a4, a5);
>  
>  	stac();
> -	asm volatile("call *%[call]"
> -		     : __HYPERCALL_5PARAM
> -		     : [call] "a" (&hypercall_page[call])
> -		     : __HYPERCALL_CLOBBER5);
> +	ASM_CALL("call *%[call]",
> +		 __HYPERCALL_5PARAM,
> +		[call] "a" (&hypercall_page[call]),
> +		__HYPERCALL_CLOBBER5);
>  	clac();
>  
>  	return (long)__res;
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index fb0055953fbc..4d0d326029a7 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -5284,16 +5284,15 @@ static void fetch_possible_mmx_operand(struct x86_emulate_ctxt *ctxt,
>  
>  static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *))
>  {
> -	register void *__sp asm(_ASM_SP);
>  	ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;
>  
>  	if (!(ctxt->d & ByteOp))
>  		fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;
>  
> -	asm("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n"
> -	    : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
> -	      [fastop]"+S"(fop), "+r"(__sp)
> -	    : "c"(ctxt->src2.val));
> +	ASM_CALL("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n",
> +		 ARGS("+a"(ctxt->dst.val), "+d"(ctxt->src.val),
> +		      [flags]"+D"(flags), [fastop]"+S"(fop)),
> +		 ARGS("c"(ctxt->src2.val)));
>  
>  	ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
>  	if (!fop) /* exception is returned in fop variable */
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index ffd469ecad57..d42806ad3c9c 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -8671,7 +8671,6 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
>  static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
>  {
>  	u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
> -	register void *__sp asm(_ASM_SP);
>  
>  	if ((exit_intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK))
>  			== (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR)) {
> @@ -8686,7 +8685,7 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
>  		vector =  exit_intr_info & INTR_INFO_VECTOR_MASK;
>  		desc = (gate_desc *)vmx->host_idt_base + vector;
>  		entry = gate_offset(*desc);
> -		asm volatile(
> +		ASM_CALL(
>  #ifdef CONFIG_X86_64
>  			"mov %%" _ASM_SP ", %[sp]\n\t"
>  			"and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
> @@ -8695,16 +8694,14 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
>  #endif
>  			"pushf\n\t"
>  			__ASM_SIZE(push) " $%c[cs]\n\t"
> -			"call *%[entry]\n\t"
> -			:
> +			"call *%[entry]\n\t",
>  #ifdef CONFIG_X86_64
> -			[sp]"=&r"(tmp),
> +			[sp]"=&r"(tmp)
>  #endif
> -			"+r"(__sp)
> -			:
> -			[entry]"r"(entry),
> -			[ss]"i"(__KERNEL_DS),
> -			[cs]"i"(__KERNEL_CS)
> +			,
> +			ARGS([entry]"r"(entry),
> +			     [ss]"i"(__KERNEL_DS),
> +			     [cs]"i"(__KERNEL_CS))
>  			);
>  	}
>  }
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 2a1fa10c6a98..ca642ac3a390 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -802,7 +802,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
>  	if (is_vmalloc_addr((void *)address) &&
>  	    (((unsigned long)tsk->stack - 1 - address < PAGE_SIZE) ||
>  	     address - ((unsigned long)tsk->stack + THREAD_SIZE) < PAGE_SIZE)) {
> -		register void *__sp asm("rsp");
>  		unsigned long stack = this_cpu_read(orig_ist.ist[DOUBLEFAULT_STACK]) - sizeof(void *);
>  		/*
>  		 * We're likely to be running with very little stack space
> @@ -814,13 +813,12 @@ no_context(struct pt_regs *regs, unsigned long error_code,
>  		 * and then double-fault, though, because we're likely to
>  		 * break the console driver and lose most of the stack dump.
>  		 */
> -		asm volatile ("movq %[stack], %%rsp\n\t"
> -			      "call handle_stack_overflow\n\t"
> -			      "1: jmp 1b"
> -			      : "+r" (__sp)
> -			      : "D" ("kernel stack overflow (page fault)"),
> -				"S" (regs), "d" (address),
> -				[stack] "rm" (stack));
> +		ASM_CALL("movq %[stack], %%rsp\n\t"
> +			 "call handle_stack_overflow\n\t"
> +			 "1: jmp 1b",
> +			 ,
> +			 ARGS("D" ("kernel stack overflow (page fault)"),
> +			      "S" (regs), "d" (address), [stack] "rm" (stack)));
>  		unreachable();
>  	}
>  #endif
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 3f8c88e29a46..3a57f32a0bfd 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -589,4 +589,18 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
>  	(_________p1); \
>  })
>  
> +#define ARGS(args...) args
> +
> +#ifdef CONFIG_FRAME_POINTER
> +
> +#define ASM_CALL(str, outputs, inputs, clobbers...) \
> +	asm volatile(str : outputs : inputs : "sp", ## clobbers)
> +
> +#else
> +
> +#define ASM_CALL(str, outputs, inputs, clobbers...) \
> +	asm volatile(str : outputs : inputs : clobbers);
> +
> +#endif
> +
>  #endif /* __LINUX_COMPILER_H */
> diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt
> index 6a1af43862df..766a00ebf80c 100644
> --- a/tools/objtool/Documentation/stack-validation.txt
> +++ b/tools/objtool/Documentation/stack-validation.txt
> @@ -193,13 +193,8 @@ they mean, and suggestions for how to fix them.
>  
>     If it's a GCC-compiled .c file, the error may be because the function
>     uses an inline asm() statement which has a "call" instruction.  An
> -   asm() statement with a call instruction must declare the use of the
> -   stack pointer in its output operand.  For example, on x86_64:
> -
> -     register void *__sp asm("rsp");
> -     asm volatile("call func" : "+r" (__sp));
> -
> -   Otherwise the stack frame may not get created before the call.
> +   asm() statement with a call instruction must use the ASM_CALL macro,
> +   which forces the frame pointer to be saved before the call.
>  
>  
>  2. file.o: warning: objtool: .text+0x53: unreachable instruction
> @@ -221,7 +216,7 @@ they mean, and suggestions for how to fix them.
>     change ENDPROC to END.
>  
>  
> -4. file.o: warning: objtool: func(): can't find starting instruction
> +3. file.o: warning: objtool: func(): can't find starting instruction
>     or
>     file.o: warning: objtool: func()+0x11dd: can't decode instruction
>  
> @@ -230,7 +225,7 @@ they mean, and suggestions for how to fix them.
>     section like .data or .rodata.
>  
>  
> -5. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function
> +4. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function
>  
>     This is a kernel entry/exit instruction like sysenter or iret.  Such
>     instructions aren't allowed in a callable function, and are most
> @@ -239,7 +234,7 @@ they mean, and suggestions for how to fix them.
>     annotated with the unwind hint macros in asm/unwind_hints.h.
>  
>  
> -6. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame
> +5. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame
>  
>     This is a dynamic jump or a jump to an undefined symbol.  Objtool
>     assumed it's a sibling call and detected that the frame pointer
> @@ -253,7 +248,7 @@ they mean, and suggestions for how to fix them.
>     the unwind hint macros in asm/unwind_hints.h.
>  
>  
> -7. file: warning: objtool: func()+0x5c: stack state mismatch
> +6. file: warning: objtool: func()+0x5c: stack state mismatch
>  
>     The instruction's frame pointer state is inconsistent, depending on
>     which execution path was taken to reach the instruction.
> @@ -270,7 +265,7 @@ they mean, and suggestions for how to fix them.
>     asm/unwind_hints.h.
>  
>  
> -8. file.o: warning: objtool: funcA() falls through to next function funcB()
> +7. file.o: warning: objtool: funcA() falls through to next function funcB()
>  
>     This means that funcA() doesn't end with a return instruction or an
>     unconditional jump, and that objtool has determined that the function
