lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 16 Jun 2017 18:34:29 -0700
From:   Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
To:     Borislav Petkov <bp@...e.de>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Brian Gerst <brgerst@...il.com>,
        Chris Metcalf <cmetcalf@...lanox.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Huang Rui <ray.huang@....com>, Jiri Slaby <jslaby@...e.cz>,
        Jonathan Corbet <corbet@....net>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Paul Gortmaker <paul.gortmaker@...driver.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Chen Yucong <slaoub@...il.com>,
        Alexandre Julliard <julliard@...ehq.org>,
        Stas Sergeev <stsp@...t.ru>, Fenghua Yu <fenghua.yu@...el.com>,
        "Ravi V. Shankar" <ravi.v.shankar@...el.com>,
        Shuah Khan <shuah@...nel.org>, linux-kernel@...r.kernel.org,
        x86@...nel.org, linux-msdos@...r.kernel.org, wine-devel@...ehq.org,
        Tony Luck <tony.luck@...el.com>
Subject: Re: [PATCH v7 21/26] x86: Add emulation code for UMIP instructions

On Thu, 2017-06-08 at 20:38 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:19AM -0700, Ricardo Neri wrote:
> > The feature User-Mode Instruction Prevention present in recent Intel
> > processor prevents a group of instructions from being executed with
> > CPL > 0. Otherwise, a general protection fault is issued.
> 
> This is one of the best opening paragraphs of a commit message I've
> read this year! This is how you open: short, succinct, to the point, no
> marketing bullshit. Good!

Thanks you!
> 
> > Rather than relaying this fault to the user space (in the form of a SIGSEGV
> > signal), the instructions protected by UMIP can be emulated to provide
> > dummy results. This allows to conserve the current kernel behavior and not
> > reveal the system resources that UMIP intends to protect (the global
> > descriptor and interrupt descriptor tables, the segment selectors of the
> > local descriptor table and the task state and the machine status word).
> > 
> > This emulation is needed because certain applications (e.g., WineHQ and
> > DOSEMU2) rely on this subset of instructions to function.
> > 
> > The instructions protected by UMIP can be split in two groups. Those who
> 
> s/who/which/

I will correct.
> 
> > return a kernel memory address (sgdt and sidt) and those who return a
> 
> ditto.

I will correct here also.
> 
> > value (sldt, str and smsw).
> >
> > For the instructions that return a kernel memory address, applications
> > such as WineHQ rely on the result being located in the kernel memory space.
> > The result is emulated as a hard-coded value that, lies close to the top
> > of the kernel memory. The limit for the GDT and the IDT are set to zero.
> 
> Nice.
> 
> > Given that sldt and str are not used in common in programs supported by
> 
> You wanna say "in common programs" here? Or "not commonly used in programs" ?

I will rephrase this comment.
> 
> > WineHQ and DOSEMU2, they are not emulated.
> > 
> > The instruction smsw is emulated to return the value that the register CR0
> > has at boot time as set in the head_32.
> > 
> > Care is taken to appropriately emulate the results when segmentation is
> > used. This is, rather than relying on USER_DS and USER_CS, the function
> 
> 	"That is,... "

I will correct it.
> 
> > insn_get_addr_ref() inspects the segment descriptor pointed by the
> > registers in pt_regs. This ensures that we correctly obtain the segment
> > base address and the address and operand sizes even if the user space
> > application uses local descriptor table.
> 
> Btw, I could very well use all that nice explanation in umip.c too so
> that the high-level behavior is documented.

Sure, I will include a high-level description in the file itself.

> 
> > Cc: Andy Lutomirski <luto@...nel.org>
> > Cc: Andrew Morton <akpm@...ux-foundation.org>
> > Cc: H. Peter Anvin <hpa@...or.com>
> > Cc: Borislav Petkov <bp@...e.de>
> > Cc: Brian Gerst <brgerst@...il.com>
> > Cc: Chen Yucong <slaoub@...il.com>
> > Cc: Chris Metcalf <cmetcalf@...lanox.com>
> > Cc: Dave Hansen <dave.hansen@...ux.intel.com>
> > Cc: Fenghua Yu <fenghua.yu@...el.com>
> > Cc: Huang Rui <ray.huang@....com>
> > Cc: Jiri Slaby <jslaby@...e.cz>
> > Cc: Jonathan Corbet <corbet@....net>
> > Cc: Michael S. Tsirkin <mst@...hat.com>
> > Cc: Paul Gortmaker <paul.gortmaker@...driver.com>
> > Cc: Peter Zijlstra <peterz@...radead.org>
> > Cc: Ravi V. Shankar <ravi.v.shankar@...el.com>
> > Cc: Shuah Khan <shuah@...nel.org>
> > Cc: Vlastimil Babka <vbabka@...e.cz>
> > Cc: Tony Luck <tony.luck@...el.com>
> > Cc: Paolo Bonzini <pbonzini@...hat.com>
> > Cc: Liang Z. Li <liang.z.li@...el.com>
> > Cc: Alexandre Julliard <julliard@...ehq.org>
> > Cc: Stas Sergeev <stsp@...t.ru>
> > Cc: x86@...nel.org
> > Cc: linux-msdos@...r.kernel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
> > ---
> >  arch/x86/include/asm/umip.h |  15 +++
> >  arch/x86/kernel/Makefile    |   1 +
> >  arch/x86/kernel/umip.c      | 245 ++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 261 insertions(+)
> >  create mode 100644 arch/x86/include/asm/umip.h
> >  create mode 100644 arch/x86/kernel/umip.c
> > 
> > diff --git a/arch/x86/include/asm/umip.h b/arch/x86/include/asm/umip.h
> > new file mode 100644
> > index 0000000..077b236
> > --- /dev/null
> > +++ b/arch/x86/include/asm/umip.h
> > @@ -0,0 +1,15 @@
> > +#ifndef _ASM_X86_UMIP_H
> > +#define _ASM_X86_UMIP_H
> > +
> > +#include <linux/types.h>
> > +#include <asm/ptrace.h>
> > +
> > +#ifdef CONFIG_X86_INTEL_UMIP
> > +bool fixup_umip_exception(struct pt_regs *regs);
> > +#else
> > +static inline bool fixup_umip_exception(struct pt_regs *regs)
> > +{
> > +	return false;
> > +}
> 
> Let's save some header lines:
> 
> static inline bool fixup_umip_exception(struct pt_regs *regs) 	{ return false; }
> 
> those trunks take too much space as it is.

I will correct.
> 
> > +#endif  /* CONFIG_X86_INTEL_UMIP */
> > +#endif  /* _ASM_X86_UMIP_H */
> > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> > index 4b99423..cc1b7cc 100644
> > --- a/arch/x86/kernel/Makefile
> > +++ b/arch/x86/kernel/Makefile
> > @@ -123,6 +123,7 @@ obj-$(CONFIG_EFI)			+= sysfb_efi.o
> >  obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o
> >  obj-$(CONFIG_TRACING)			+= tracepoint.o
> >  obj-$(CONFIG_SCHED_MC_PRIO)		+= itmt.o
> > +obj-$(CONFIG_X86_INTEL_UMIP)		+= umip.o
> >  
> >  ifdef CONFIG_FRAME_POINTER
> >  obj-y					+= unwind_frame.o
> > diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
> > new file mode 100644
> > index 0000000..c7c5795
> > --- /dev/null
> > +++ b/arch/x86/kernel/umip.c
> > @@ -0,0 +1,245 @@
> > +/*
> > + * umip.c Emulation for instruction protected by the Intel User-Mode
> > + * Instruction Prevention. The instructions are:
> > + *    sgdt
> > + *    sldt
> > + *    sidt
> > + *    str
> > + *    smsw
> > + *
> > + * Copyright (c) 2017, Intel Corporation.
> > + * Ricardo Neri <ricardo.neri@...ux.intel.com>
> > + */
> > +
> > +#include <linux/uaccess.h>
> > +#include <asm/umip.h>
> > +#include <asm/traps.h>
> > +#include <asm/insn.h>
> > +#include <asm/insn-eval.h>
> > +#include <linux/ratelimit.h>
> > +
> > +/*
> > + * == Base addresses of GDT and IDT
> > + * Some applications to function rely finding the global descriptor table (GDT)
> 
> That formulation reads funny.

I will correct.

> 
> > + * and the interrupt descriptor table (IDT) in kernel memory.
> > + * For x86_32, the selected values do not match any particular hole, but it
> > + * suffices to provide a memory location within kernel memory.
> > + *
> > + * == CRO flags for SMSW
> > + * Use the flags given when booting, as found in head_32.S
> > + */
> > +
> > +#define CR0_STATE (X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | X86_CR0_NE | \
> > +		   X86_CR0_WP | X86_CR0_AM)
> 
> Why not pull those up in asm/processor-flags.h or so and share the
> definition instead of duplicating it?

Sure, I will relocate this definition.
> 
> > +#define UMIP_DUMMY_GDT_BASE 0xfffe0000
> > +#define UMIP_DUMMY_IDT_BASE 0xffff0000
> > +
> > +enum umip_insn {
> > +	UMIP_SGDT = 0,	/* opcode 0f 01 ModR/M reg 0 */
> > +	UMIP_SIDT,	/* opcode 0f 01 ModR/M reg 1 */
> > +	UMIP_SLDT,	/* opcode 0f 00 ModR/M reg 0 */
> > +	UMIP_SMSW,	/* opcode 0f 01 ModR/M reg 4 */
> > +	UMIP_STR,	/* opcode 0f 00 ModR/M reg 1 */
> 
> Let's stick to a single spelling: ModRM.reg=0, etc.
> 
> Better yet, use the SDM format:
> 
> 	UMIP_SGDT = 0,		/* 0F 01 /0 */
> 	UMIP_SIDT,		/* 0F 01 /1 */
> 	...
> 

I will update accordingly.

> > +};
> > +
> > +/**
> > + * __identify_insn() - Identify a UMIP-protected instruction
> > + * @insn:	Instruction structure with opcode and ModRM byte.
> > + *
> > + * From the instruction opcode and the reg part of the ModRM byte, identify,
> > + * if any, a UMIP-protected instruction.
> > + *
> > + * Return: an enumeration of a UMIP-protected instruction; -EINVAL on failure.
> > + */
> > +static int __identify_insn(struct insn *insn)
> 
> static enum umip_insn __identify_insn(...
> 
> But frankly, that enum looks pointless to me - it is used locally only
> and you can just as well use plain ints.

I will change to plain ints.
> 
> > +{
> > +	/* By getting modrm we also get the opcode. */
> > +	insn_get_modrm(insn);
> > +
> > +	/* All the instructions of interest start with 0x0f. */
> > +	if (insn->opcode.bytes[0] != 0xf)
> > +		return -EINVAL;
> > +
> > +	if (insn->opcode.bytes[1] == 0x1) {
> > +		switch (X86_MODRM_REG(insn->modrm.value)) {
> > +		case 0:
> > +			return UMIP_SGDT;
> > +		case 1:
> > +			return UMIP_SIDT;
> > +		case 4:
> > +			return UMIP_SMSW;
> > +		default:
> > +			return -EINVAL;
> > +		}
> > +	}
> > +	/* SLDT AND STR are not emulated */
> > +	return -EINVAL;
> > +}
> > +
> > +/**
> > + * __emulate_umip_insn() - Emulate UMIP instructions with dummy values
> > + * @insn:	Instruction structure with ModRM byte
> > + * @umip_inst:	Instruction to emulate
> > + * @data:	Buffer onto which the dummy values will be copied
> > + * @data_size:	Size of the emulated result
> > + *
> > + * Emulate an instruction protected by UMIP. The result of the emulation
> > + * is saved in the provided buffer. The size of the results depends on both
> > + * the instruction and type of operand (register vs memory address). Thus,
> > + * the size of the result needs to be updated.
> > + *
> > + * Result: 0 if success, -EINVAL on failure to emulate
> > + */
> > +static int __emulate_umip_insn(struct insn *insn, enum umip_insn umip_inst,
> > +			       unsigned char *data, int *data_size)
> > +{
> > +	unsigned long dummy_base_addr;
> > +	unsigned short dummy_limit = 0;
> > +	unsigned int dummy_value = 0;
> > +
> > +	switch (umip_inst) {
> > +	/*
> > +	 * These two instructions return the base address and limit of the
> > +	 * global and interrupt descriptor table. The base address can be
> > +	 * 24-bit, 32-bit or 64-bit. Limit is always 16-bit. If the operand
> > +	 * size is 16-bit the returned value of the base address is supposed
> > +	 * to be a zero-extended 24-byte number. However, it seems that a
> > +	 * 32-byte number is always returned in legacy protected mode
> > +	 * irrespective of the operand size.
> > +	 */
> > +	case UMIP_SGDT:
> > +		/* fall through */
> > +	case UMIP_SIDT:
> > +		if (umip_inst == UMIP_SGDT)
> > +			dummy_base_addr = UMIP_DUMMY_GDT_BASE;
> > +		else
> > +			dummy_base_addr = UMIP_DUMMY_IDT_BASE;
> > +		if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> > +			/* SGDT and SIDT do not take register as argument. */
> 
> Comment above the if.

I will correct.

> 
> > +			return -EINVAL;
> > +		}
> 
> So that check needs to go first, then the dummy_base_addr assignment.

I will rearrange.

> 
> > +
> > +		memcpy(data + 2, &dummy_base_addr, sizeof(dummy_base_addr));
> > +		memcpy(data, &dummy_limit, sizeof(dummy_limit));
> > +		*data_size = sizeof(dummy_base_addr) + sizeof(dummy_limit);
> 
> Huh, that value will always be the same - why do you have a specific
> variable? It could be a define, once for 32-bit and once for 64-bit.

Sure. I will use #define's.

> > +		break;
> > +	case UMIP_SMSW:
> > +		/*
> > +		 * Even though CR0_STATE contain 4 bytes, the number
> > +		 * of bytes to be copied in the result buffer is determined
> > +		 * by whether the operand is a register or a memory location.
> > +		 */
> > +		dummy_value = CR0_STATE;
> 
> Something's wrong here: how does that local, write-only variable have
> any effect?

Ah yes, initially SMSW, SLDT and STR were handled equally. Since I
removed support for the last two, I inadvertently removed the code that
copies the result of SMSW. I will re-add it.
 
> 
> > +		/*
> > +		 * These two instructions return a 16-bit value. We return
> > +		 * all zeros. This is equivalent to a null descriptor for
> > +		 * str and sldt.
> > +		 */
> > +		/* SLDT and STR are not emulated */
> > +		/* fall through */
> > +	case UMIP_SLDT:
> > +		/* fall through */
> > +	case UMIP_STR:
> > +		/* fall through */
> > +	default:
> > +		return -EINVAL;
> 
> That switch-case has a majority of fall-throughs. So make it an if-else
> instead.

Sure, I will update.

> 
> > +	}
> > +	return 0;
> > +}
> > +
> > +/**
> > + * fixup_umip_exception() - Fixup #GP faults caused by UMIP
> > + * @regs:	Registers as saved when entering the #GP trap
> > + *
> > + * The instructions sgdt, sidt, str, smsw, sldt cause a general protection
> > + * fault if with CPL > 0 (i.e., from user space). This function can be
> > + * used to emulate the results of the aforementioned instructions with
> > + * dummy values. Results are copied to user-space memory as indicated by
> > + * the instruction pointed by EIP using the registers indicated in the
> > + * instruction operands. This function also takes care of determining
> > + * the address to which the results must be copied.
> > + */
> > +bool fixup_umip_exception(struct pt_regs *regs)
> > +{
> > +	struct insn insn;
> > +	unsigned char buf[MAX_INSN_SIZE];
> > +	/* 10 bytes is the maximum size of the result of UMIP instructions */
> > +	unsigned char dummy_data[10] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
> 
> 	unsigned char dummy_data[10] = { 0 };
> 
> One 0 should be enough :)

Right. I will update.

> 
> > +	unsigned long seg_base;
> > +	int not_copied, nr_copied, reg_offset, dummy_data_size;
> > +	void __user *uaddr;
> > +	unsigned long *reg_addr;
> > +	enum umip_insn umip_inst;
> > +	struct insn_code_seg_defaults seg_defs;
> 
> Please sort function local variables declaration in a reverse christmas
> tree order:
> 
> 	<type> longest_variable_name;
> 	<type> shorter_var_name;
> 	<type> even_shorter;
> 	<type> i;
> 
I will rearrange my variables.

> > +
> > +	/*
> > +	 * Use the segment base in case user space used a different code
> > +	 * segment, either in protected (e.g., from an LDT) or virtual-8086
> > +	 * modes. In most of the cases seg_base will be zero as in USER_CS.
> > +	 */
> > +	seg_base = insn_get_seg_base(regs, &insn,
> > +				     offsetof(struct pt_regs, ip));
> 
> Oh boy, where's the error handling?! That can return -1L.
> 
> > +	not_copied = copy_from_user(buf, (void __user *)(seg_base + regs->ip),
> 
> -1L + regs->ip is then your pwnage.

I will add the error handling code.
> 
> > +				    sizeof(buf));
> 
> Just let them stick out.

Sure.

> 
> > +	nr_copied = sizeof(buf) - not_copied;
> 
> <---- newline here.

I will add the new line.
> 
> > +	/*
> > +	 * The copy_from_user above could have failed if user code is protected
> 			    ()
> 
> > +	 * by a memory protection key. Give up on emulation in such a case.
> > +	 * Should we issue a page fault?
> 
> Why? AFAICT, you're in the #GP handler. Simply you return unhandled.

If I returned unhandled, a SIGSEGV will be sent to the user space
application but siginfo will look like a #GP. However, memory protection
keys cause page faults and siginfo is filled differently.

> 
> > +	 */
> > +	if (!nr_copied)
> > +		return false;
> > +
> > +	insn_init(&insn, buf, nr_copied, user_64bit_mode(regs));
> > +
> > +	/*
> > +	 * Override the default operand and address sizes to what is specified
> > +	 * in the code segment descriptor. The instruction decoder only sets
> > +	 * the address size it to either 4 or 8 address bytes and does nothing
> > +	 * for the operand bytes. This OK for most of the cases, but we could
> > +	 * have special cases where, for instance, a 16-bit code segment
> > +	 * descriptor is used.
> > +	 * If there are overrides, the instruction decoder correctly updates
> > +	 * these values, even for 16-bit defaults.
> > +	 */
> > +	seg_defs = insn_get_code_seg_defaults(regs);
> > +	insn.addr_bytes = seg_defs.address_bytes;
> > +	insn.opnd_bytes = seg_defs.operand_bytes;
> > +
> > +	if (!insn.addr_bytes || !insn.opnd_bytes)
> > +		return false;
> > +
> > +	if (user_64bit_mode(regs))
> > +		return false;
> > +
> > +	insn_get_length(&insn);
> > +	if (nr_copied < insn.length)
> > +		return false;
> > +
> > +	umip_inst = __identify_insn(&insn);
> > +	/* Check if we found an instruction protected by UMIP */
> 
> Put comment above the function call.

Will do.

> 
> > +	if (umip_inst < 0)
> > +		return false;
> > +
> > +	if (__emulate_umip_insn(&insn, umip_inst, dummy_data, &dummy_data_size))
> > +		return false;
> > +
> > +	/* If operand is a register, write directly to it */
> > +	if (X86_MODRM_MOD(insn.modrm.value) == 3) {
> > +		reg_offset = insn_get_modrm_rm_off(&insn, regs);
> 
> Grr, error handling!! That reg_offset can be -E<something>.

I will add the error handling code.

> 
> > +		reg_addr = (unsigned long *)((unsigned long)regs + reg_offset);
> > +		memcpy(reg_addr, dummy_data, dummy_data_size);
> > +	} else {
> > +		uaddr = insn_get_addr_ref(&insn, regs);
> > +		/* user address could not be determined, abort emulation */
> 
> That comment is kinda obvious. But yes, this has error handling.

OK, I will remove this comment.

Many thanks for your detailed review!

BR,
Ricardo

Powered by blists - more mailing lists