Date:	Tue, 9 Feb 2016 13:06:17 +0100
From:	Borislav Petkov <bp@...en8.de>
To:	Andy Lutomirski <luto@...nel.org>
Cc:	X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Brian Gerst <brgerst@...il.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Stas Sergeev <stsp@...t.ru>,
	Cyrill Gorcunov <gorcunov@...il.com>,
	Pavel Emelyanov <xemul@...allels.com>
Subject: Re: [PATCH v3 2/4] x86/signal/64: Fix SS if needed when delivering a
 64-bit signal

On Mon, Jan 25, 2016 at 01:34:13PM -0800, Andy Lutomirski wrote:
> Signals are always delivered to 64-bit tasks with CS set to a long
> mode segment.  In long mode, SS doesn't matter as long as it's a
> present writable segment.
> 
> If SS starts out invalid (this can happen if the signal was caused
> by an IRET fault or was delivered on the way out of set_thread_area
> or modify_ldt), then IRET to the signal handler can fail, eventually
> killing the task.
> 
> The straightforward fix would be to simply reset SS when delivering
> a signal.  That breaks DOSEMU, though: 64-bit builds of DOSEMU rely
> on SS being set to the faulting SS when signals are delivered.
> 
> As a compromise, this patch leaves SS alone so long as it's valid.
> 
> The net effect should be that the behavior of successfully delivered
> signals is unchanged.  Some signals that would previously have
> failed to be delivered will now be delivered successfully.
> 
> This has no effect for x32 or 32-bit tasks: their signal handlers
> were already called with SS == __USER_DS.
> 
> (On Xen, there's a slight hole: if a task sets SS to a writable
>  *kernel* data segment, then we will fail to identify it as invalid
>  and we'll still kill the task.  If anyone cares, this could be fixed
>  with a new paravirt hook.)
> 
> Signed-off-by: Andy Lutomirski <luto@...nel.org>
> ---
>  arch/x86/include/asm/desc_defs.h | 23 ++++++++++++++++++
>  arch/x86/kernel/signal.c         | 51 ++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 72 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/desc_defs.h b/arch/x86/include/asm/desc_defs.h
> index 278441f39856..00971705a16d 100644
> --- a/arch/x86/include/asm/desc_defs.h
> +++ b/arch/x86/include/asm/desc_defs.h
> @@ -98,4 +98,27 @@ struct desc_ptr {
>  
>  #endif /* !__ASSEMBLY__ */
>  
> +/* Access rights as returned by LAR */
> +#define AR_TYPE_RODATA		(0 * (1 << 9))
> +#define AR_TYPE_RWDATA		(1 * (1 << 9))
> +#define AR_TYPE_RODATA_EXPDOWN	(2 * (1 << 9))
> +#define AR_TYPE_RWDATA_EXPDOWN	(3 * (1 << 9))
> +#define AR_TYPE_XOCODE		(4 * (1 << 9))
> +#define AR_TYPE_XRCODE		(5 * (1 << 9))
> +#define AR_TYPE_XOCODE_CONF	(6 * (1 << 9))
> +#define AR_TYPE_XRCODE_CONF	(7 * (1 << 9))
> +#define AR_TYPE_MASK		(7 * (1 << 9))
> +
> +#define AR_DPL0			(0 * (1 << 13))
> +#define AR_DPL3			(3 * (1 << 13))
> +#define AR_DPL_MASK		(3 * (1 << 13))
> +
> +#define AR_A			(1 << 8)	/* A means "accessed" */
> +#define AR_S			(1 << 12)	/* S means "not system" */

Ah, with "not system" you mean that S=0b makes it a system descriptor
and S=1b a user (code or data) descriptor. I think the SDM calls it,
more descriptively, the "S (descriptor type) flag", while the APM calls
it simply the S-field or S-bit.

I like "S (descriptor type) flag" more than "not system". :)

> +#define AR_P			(1 << 15)	/* P means "present" */
> +#define AR_AVL			(1 << 20)	/* AVL does nothing */

AVL = AVaiLable to software

> +#define AR_L			(1 << 21)	/* L means "long mode" */
> +#define AR_DB			(1 << 22)	/* D or B, depending on type */
> +#define AR_G			(1 << 23)	/* G means "limit in pages" */

Please use the names from the processor manuals. G is the Granularity
bit. "limit in pages" is only clear to the people who have already read
the Granularity bit description. :-)
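
To make the naming suggestion concrete, here is a rough userspace sketch
that decodes an access-rights value into fields named after the manuals.
It is not part of the patch: struct ar_fields, ar_decode() and
AR_DPL_SHIFT are hypothetical names made up for illustration, and the bit
positions are simply the ones from the defines quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as in the quoted patch, comments use manual names. */
#define AR_A		(1u << 8)	/* Accessed */
#define AR_TYPE_MASK	(7u << 9)	/* type bits above A */
#define AR_S		(1u << 12)	/* S (descriptor type) flag */
#define AR_DPL_SHIFT	13		/* hypothetical helper define */
#define AR_DPL_MASK	(3u << AR_DPL_SHIFT)	/* Descriptor Privilege Level */
#define AR_P		(1u << 15)	/* Present */
#define AR_AVL		(1u << 20)	/* AVaiLable to software */
#define AR_L		(1u << 21)	/* Long mode (64-bit code segment) */
#define AR_DB		(1u << 22)	/* D or B, depending on type */
#define AR_G		(1u << 23)	/* Granularity */

static_assert(AR_TYPE_MASK == 0x0e00, "type field covers bits 9..11");

/* Hypothetical container for the decoded fields. */
struct ar_fields {
	unsigned type;		/* 0..7, same numbering as AR_TYPE_* / (1 << 9) */
	unsigned accessed, s, dpl, present, avl, l, db, g;
};

/* Split a raw access-rights value into its individual fields. */
static struct ar_fields ar_decode(uint32_t ar)
{
	struct ar_fields f = {
		.type     = (ar & AR_TYPE_MASK) >> 9,
		.accessed = !!(ar & AR_A),
		.s        = !!(ar & AR_S),
		.dpl      = (ar & AR_DPL_MASK) >> AR_DPL_SHIFT,
		.present  = !!(ar & AR_P),
		.avl      = !!(ar & AR_AVL),
		.l        = !!(ar & AR_L),
		.db       = !!(ar & AR_DB),
		.g        = !!(ar & AR_G),
	};
	return f;
}
```

For example, ar_decode(0x00c0f300) gives type 1 (read-write data),
accessed, S=1, DPL=3, P=1, D/B=1 and G=1, which is the shape one would
expect for a flat user data segment.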

>  #endif /* _ASM_X86_DESC_DEFS_H */
> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index cb6282c3638f..bb3e4208d90d 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -61,6 +61,35 @@
>  	regs->seg = GET_SEG(seg) | 3;			\
>  } while (0)
>  
> +#ifdef CONFIG_X86_64

You already have an

#else /* !CONFIG_X86_32 */

block above the 64-bit version of __setup_rt_frame(). Just put
force_valid_ss() there without that additional ifdef. That file's
ifdeffery is beyond any readability anyway.

> +/*
> + * If regs->ss will cause an IRET fault, change it.  Otherwise leave it
> + * alone.  Using this generally makes no sense unless
> + * user_64bit_mode(regs) would return true.
> + */
> +static void force_valid_ss(struct pt_regs *regs)
> +{
> +	u32 ar;
> +	asm volatile ("lar %[old_ss], %[ar]\n\t"
> +		      "jz 1f\n\t"		/* If invalid: */
> +		      "xorl %[ar], %[ar]\n\t"	/* set ar = 0 */
> +		      "1:"
> +		      : [ar] "=r" (ar)
> +		      : [old_ss] "rm" ((u16)regs->ss));
> +
> +	/*
> +	 * For a valid 64-bit user context, we need DPL 3, type
> +	 * read-write data or read-write exp-down data, and S and P
> +	 * set.  We can't use VERW because VERW doesn't check the
> +	 * P bit.
> +	 */
> +	ar &= AR_DPL_MASK | AR_S | AR_P | AR_TYPE_MASK;
> +	if (ar != (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA) &&
> +	    ar != (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA_EXPDOWN))
> +		regs->ss = __USER_DS;
> +}
> +#endif
> +
>  int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
>  {
>  	unsigned long buf_val;
> @@ -459,10 +488,28 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
>  
>  	regs->sp = (unsigned long)frame;
>  
> -	/* Set up the CS register to run signal handlers in 64-bit mode,
> -	   even if the handler happens to be interrupting 32-bit code. */
> +	/*
> +	 * Set up the CS and SS registers to run signal handlers in
> +	 * 64-bit mode, even if the handler happens to be interrupting
> +	 * 32-bit or 16-bit code.
> +	 *
> +	 * SS is subtle.  In 64-bit mode, we don't need any particular
> +	 * SS descriptor, but we do need SS to be valid.  It's possible
> +	 * that the old SS is entirely bogus -- this can happen if the
> +	 * signal we're trying to deliver is #GP or #SS caused by a bad
> +	 * SS value.  We also have a compatibility issue here: DOSEMU
> +	 * relies on the contents of the SS register indicating the
> +	 * SS value at the time of the signal, even though that code in
> +	 * DOSEMU predates sigreturn's ability to restore SS.  (DOSEMU
> +	 * avoids relying on sigreturn to restore SS; instead it uses
> +	 * a trampoline.)  So we do our best: if the old SS was valid,
> +	 * we keep it.  Otherwise we replace it.
> +	 */
>  	regs->cs = __USER_CS;
>  
> +	if (unlikely(regs->ss != __USER_DS))

So this is the fast path, AFAICT, and going by a gdb breakpoint I set here.

I guess we can't do the opt-in behavior and patch it out when users
don't want to run dosemu.

Or maybe we could add a CONFIG_CHECK_OLD_SS which is default y and
people can disable it... so an opt-out behavior :)

Hmmm...
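
Roughly what I have in mind, as a userspace model rather than kernel
code. Everything here is an assumption for illustration: CHECK_OLD_SS
stands in for the hypothetical CONFIG_CHECK_OLD_SS, MODEL_USER_DS is a
stand-in for __USER_DS, the caller passes in the access rights that LAR
would have returned (0 if LAR failed), and I am assuming that opting out
means SS simply gets reset unconditionally.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit layout as the AR_* defines in the quoted patch. */
#define AR_TYPE_RWDATA		(1u << 9)
#define AR_TYPE_RWDATA_EXPDOWN	(3u << 9)
#define AR_TYPE_MASK		(7u << 9)
#define AR_S			(1u << 12)
#define AR_DPL3			(3u << 13)
#define AR_DPL_MASK		(3u << 13)
#define AR_P			(1u << 15)

/* Hypothetical stand-in for the proposed option; default on (opt-out). */
#ifndef CHECK_OLD_SS
#define CHECK_OLD_SS 1
#endif

/* Illustrative stand-in for __USER_DS, not taken from kernel headers. */
#define MODEL_USER_DS 0x2bu

/*
 * Mirrors the test in the quoted force_valid_ss(): usable means DPL 3,
 * S and P set, and type read-write data (normal or expand-down).
 * ar == 0 models a failed LAR and is therefore never usable.
 */
static bool ss_ar_usable(uint32_t ar)
{
	ar &= AR_DPL_MASK | AR_S | AR_P | AR_TYPE_MASK;
	return ar == (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA) ||
	       ar == (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA_EXPDOWN);
}

/* Pick the SS the signal handler would run with in this model. */
static uint16_t pick_handler_ss(uint16_t old_ss, uint32_t old_ss_ar)
{
	if (!CHECK_OLD_SS)
		return MODEL_USER_DS;	/* opted out: always reset SS */
	if (old_ss == MODEL_USER_DS)
		return old_ss;		/* common case, nothing to check */
	return ss_ar_usable(old_ss_ar) ? old_ss : MODEL_USER_DS;
}
```

With the default setting, a valid DPL 3 read-write data SS is kept as-is
and anything else falls back to MODEL_USER_DS. Building with
-DCHECK_OLD_SS=0 turns the whole thing into an unconditional reset, which
is the part that would break DOSEMU.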

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
