linux-kernel - [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention

Message-Id: <1507089272-32733-1-git-send-email-ricardo.neri-calderon@linux.intel.com>
Date:   Tue,  3 Oct 2017 20:54:03 -0700
From:   Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
To:     Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...nel.org>, Borislav Petkov <bp@...e.de>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Brian Gerst <brgerst@...il.com>,
        Chris Metcalf <cmetcalf@...lanox.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Liang Z Li <liang.z.li@...el.com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Huang Rui <ray.huang@....com>, Jiri Slaby <jslaby@...e.cz>,
        Jonathan Corbet <corbet@....net>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Paul Gortmaker <paul.gortmaker@...driver.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Chen Yucong <slaoub@...il.com>,
        "Ravi V. Shankar" <ravi.v.shankar@...el.com>,
        Shuah Khan <shuah@...nel.org>, linux-kernel@...r.kernel.org,
        x86@...nel.org, ricardo.neri@...el.com,
        Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
Subject: [PATCH v9 00/29] x86: Enable User-Mode Instruction Prevention

This is v9 of this series. The eight previous submissions can be found
here [1], here [2], here [3], here [4], here [5], here [6], here [7] and
here [8].
This version addresses the feedback comments from Borislav Petkov received on
v7. Please see details in the change log.

=== What is UMIP?

User-Mode Instruction Prevention (UMIP) is a security feature present in
new Intel Processors. If enabled, it prevents the execution of certain
instructions if the Current Privilege Level (CPL) is greater than 0. If
these instructions were executed while in CPL > 0, user space applications
could have access to system-wide settings such as the global and local
descriptor tables, the segment selectors to the current task state and the
local descriptor table. Hiding these system resources reduces the tools
available to craft privilege escalation attacks such as [9].

These are the instructions covered by UMIP:
* SGDT - Store Global Descriptor Table
* SIDT - Store Interrupt Descriptor Table
* SLDT - Store Local Descriptor Table
* SMSW - Store Machine Status Word
* STR - Store Task Register

If any of these instructions is executed with CPL > 0, a general protection
exception is issued when UMIP is enabled.
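
As an illustration only (this is not code from the series), the five
instructions can be told apart by their opcode bytes. The sketch below
assumes that prefixes have already been skipped and that the encodings are
SLDT = 0F 00 /0, STR = 0F 00 /1, SGDT = 0F 01 /0, SIDT = 0F 01 /1 and
SMSW = 0F 01 /4. The names umip_insn and umip_classify() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum umip_insn {
	UMIP_NONE, UMIP_SGDT, UMIP_SIDT, UMIP_SLDT, UMIP_SMSW, UMIP_STR
};

/*
 * Classify an instruction from its opcode bytes: op[0] and op[1] are the
 * two opcode bytes and op[2] is the ModRM byte. Prefixes are assumed to
 * have been consumed by the caller.
 */
static enum umip_insn umip_classify(const uint8_t *op, size_t len)
{
	uint8_t mod, reg;

	if (len < 3 || op[0] != 0x0f)
		return UMIP_NONE;

	mod = op[2] >> 6;          /* ModRM.mod */
	reg = (op[2] >> 3) & 0x7;  /* ModRM.reg selects the instruction */

	if (op[1] == 0x00) {
		if (reg == 0)
			return UMIP_SLDT;
		if (reg == 1)
			return UMIP_STR;
		return UMIP_NONE;
	}

	if (op[1] == 0x01) {
		/* SGDT and SIDT only take memory operands (mod != 11b). */
		if (reg == 0 && mod != 3)
			return UMIP_SGDT;
		if (reg == 1 && mod != 3)
			return UMIP_SIDT;
		if (reg == 4)
			return UMIP_SMSW;
	}

	return UMIP_NONE;
}
```

For example, the bytes 0F 01 E0 (SMSW with a register operand) classify as
UMIP_SMSW, while 0F 01 C1 is not a UMIP-protected instruction because
ModRM.mod is 11b with ModRM.reg equal to 0.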

=== How does it impact applications?

When enabled, however, UMIP changes the behavior that certain
applications expect from the operating system. For instance, programs
running on WineHQ and DOSEMU2 rely on some of these instructions to
function. Stas Sergeev found that Microsoft Windows 3.1 and dos4gw use the
instruction SMSW when running in virtual-8086 mode [10]. SGDT and SIDT can
also be used in virtual-8086 mode.

In order to not change the behavior of the system, this patchset emulates
SGDT, SIDT and SMSW. This should be sufficient to not break the
applications mentioned above. Regarding the two remaining instructions, STR
and SLDT, the WineHQ team has shown interest in catching the general
protection fault and using it as a vehicle to fix broken applications [11].
Furthermore, STR and SLDT can only run in protected and long modes.

DOSEMU2 emulates virtual-8086 mode via KVM. No applications will be broken
unless DOSEMU2 decides to enable the CR4.UMIP bit in platforms that support
it. Also, this should not pose a security risk as no system resources would
be revealed. Instead, code running inside the KVM would only see the KVM's
GDT, IDT and MSW.

Please note that UMIP is always enabled for both 64-bit and 32-bit Linux
builds. However, emulation of the UMIP-protected instructions is not done
for 64-bit processes. 64-bit user space applications will receive the
SIGSEGV signal when a UMIP-protected instruction causes a general
protection fault.

=== How are UMIP-protected instructions emulated?

UMIP is kept enabled at all times when the CONFIG_X86_INTEL_UMIP option is
selected. If a general protection fault caused by the instructions
protected by UMIP is detected, such fault will be trapped and fixed-up. The
return values will be dummy as follows:
 
 * SGDT and SIDT return hard-coded dummy values as the base of the global
   descriptor and interrupt descriptor tables. These hard-coded values
   correspond to memory addresses that are near the end of the kernel
   memory map. This is also the case for virtual-8086 mode tasks. In all
   my experiments with 32-bit processes, the base of GDT and IDT was always
   a 4-byte address, even for 16-bit operands. Thus, my emulation code does
   the same. In all cases, the limit of the table is set to 0.
 * SMSW returns the value with which the CR0 register is programmed in
   head_32/64.S at boot time. That is, the following bits are enabled:
   CR0.0 for Protection Enable, CR0.1 for Monitor Coprocessor, CR0.4 for
   Extension Type, which will always be 1 in recent processors with UMIP;
   CR0.5 for Numeric Error, CR0.16 for Write Protect, CR0.18 for Alignment
   Mask and CR0.31 for Paging. As per the Intel 64 and IA-32 Architectures
   Software Developer's Manual, SMSW returns a 16-bit result for memory
   operands. However, when the operand is a register, the result can be up
   to CR0[63:0]. Since the emulation code only kicks in for 32-bit
   processes, we return up to CR0[31:0].
 * The proposed emulation code handles faults that happen in both
   protected and virtual-8086 mode.
 * Again, STR and SLDT are not emulated.
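
The dummy results described above can be sketched as follows. This is a
minimal illustration, not the code in umip.c: the CR0 bit positions come
from the list above, while DUMMY_GDT_BASE, CR0_DUMMY and both function
names are hypothetical. The base value is only a placeholder for an address
"near the end of the kernel memory map".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* CR0 bits listed in the text above. */
#define CR0_PE (UINT32_C(1) << 0)   /* Protection Enable */
#define CR0_MP (UINT32_C(1) << 1)   /* Monitor Coprocessor */
#define CR0_ET (UINT32_C(1) << 4)   /* Extension Type */
#define CR0_NE (UINT32_C(1) << 5)   /* Numeric Error */
#define CR0_WP (UINT32_C(1) << 16)  /* Write Protect */
#define CR0_AM (UINT32_C(1) << 18)  /* Alignment Mask */
#define CR0_PG (UINT32_C(1) << 31)  /* Paging */

#define CR0_DUMMY (CR0_PE | CR0_MP | CR0_ET | CR0_NE | \
		   CR0_WP | CR0_AM | CR0_PG)

/* Hypothetical placeholder for a dummy table base address. */
#define DUMMY_GDT_BASE UINT32_C(0xfffe0000)

/*
 * Build the SGDT/SIDT result for a 32-bit process: a 2-byte limit, set
 * to 0, followed by a 4-byte base. Returns the number of bytes written.
 * memcpy() stores in host byte order, which is little-endian on x86.
 */
static size_t fill_sgdt_sidt(uint8_t out[6], uint32_t dummy_base)
{
	const uint16_t limit = 0;

	memcpy(out, &limit, sizeof(limit));
	memcpy(out + sizeof(limit), &dummy_base, sizeof(dummy_base));
	return sizeof(limit) + sizeof(dummy_base);
}

/*
 * SMSW result: 16 bits for memory operands and for 16-bit register
 * operands, the full CR0[31:0] for a 32-bit register operand.
 */
static uint32_t smsw_value(bool register_operand, unsigned int operand_bytes)
{
	if (register_operand && operand_bytes == 4)
		return CR0_DUMMY;
	return CR0_DUMMY & UINT32_C(0xffff);
}
```

With these bit positions CR0_DUMMY evaluates to 0x80050033, so a memory
operand receives 0x0033.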

=== How is this series laid out?

++ Preparatory work
As per suggestions from Andy Lutomirski and Borislav Petkov, I moved
the x86 page fault error codes to a header. Also, I made user_64bit_mode
available to x86_32 builds. This helps to reuse code and reduce the number
of #ifdef's in these patches. Borislav also suggested that uprobes should
use the existing definitions in arch/x86/include/asm/inat.h instead of
hard-coded values when checking instruction prefixes. I included this
change in the series.

++ Fix bugs in MPX address decoder
I found the code for Intel MPX (Memory Protection Extensions) very useful.
It is used to parse opcodes and the memory locations contained in the
general purpose registers when used as operands. I put this code in a
separate library file that MPX, UMIP and potentially others can access,
avoiding code duplication.

Before creating the new library, I fixed a couple of bugs that I found in
corner cases of how MPX determines the address contained in the
instruction and operands.

++ Provide a new x86 instruction evaluating library
With bugs fixed, the MPX evaluating code is relocated to a new insn-eval.c
library. The basic functionality of this library is extended to obtain the
segment descriptor selected by either segment override prefixes or the
default segment implied by the registers involved in the calculation of the
effective address. It was also extended to obtain the default address and
operand sizes as well as the segment base address, and to process 16-bit
address encodings. Armed with this arsenal, it is now possible to determine
the linear address onto which the emulated results shall be copied.
Furthermore, this new library relies on and extends the capabilities of the
existing instruction decoder in arch/x86/lib/insn.c.
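
To make the address computation concrete, here is a simplified sketch for
32-bit address encodings in protected mode. It is not the insn-eval.c
implementation: struct ea32 and both functions are hypothetical, the ModRM
and SIB fields are assumed to be decoded already, and expand-down segments
are ignored.

```c
#include <stdint.h>

/* Already-decoded operands of a 32-bit ModRM/SIB memory reference. */
struct ea32 {
	uint32_t base;   /* value of the base register, 0 if unused */
	uint32_t index;  /* value of the index register, 0 if unused */
	uint8_t scale;   /* SIB.scale, 0..3: index is multiplied by 1 << scale */
	int32_t disp;    /* sign-extended displacement, 0 if absent */
};

/* Effective address: base + index * 2^scale + disp, modulo 2^32. */
static uint32_t effective_addr32(const struct ea32 *a)
{
	return a->base + (a->index << a->scale) + (uint32_t)a->disp;
}

/*
 * Linear address: segment base plus effective address. Returns 0 on
 * success and -1 if the effective address exceeds the segment limit.
 */
static int linear_addr32(uint32_t seg_base, uint32_t seg_limit,
			 uint32_t ea, uint32_t *linear)
{
	if (ea > seg_limit)
		return -1;

	*linear = seg_base + ea;
	return 0;
}
```

For instance, base = 0x1000, index = 4, scale = 2 and disp = -8 give an
effective address of 0x1008. With a segment base of 0x10000 the linear
address is 0x11008, provided the segment limit is at least 0x1008.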

This code supports long mode with 32 and 64 bit addresses, protected mode
with 16 and 32 bit addresses and virtual-8086 mode with 16 and 32 bit
addresses. Both global and local descriptor tables are supported.
Segmentation is supported in protected mode; in long mode, it is supported
via the FS and GS registers.
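
For 16-bit address encodings and virtual-8086 mode the arithmetic is
different. The following sketch is illustrative only and uses hypothetical
names. It applies the standard 16-bit ModRM.rm register pairs, keeps the
effective address within 16 bits and builds a 20-bit linear address from
the segment selector.

```c
#include <stdint.h>

/* The only registers that 16-bit addressing can reference. */
struct regs16 {
	uint16_t bx, bp, si, di;
};

/*
 * 16-bit effective address. ModRM.rm selects the register combination.
 * With mod == 0 and rm == 6 there is no base register, only a 16-bit
 * displacement. The caller passes disp = 0 when no displacement exists.
 */
static uint16_t effective_addr16(const struct regs16 *r, uint8_t mod,
				 uint8_t rm, int16_t disp)
{
	uint16_t ea;

	switch (rm & 0x7) {
	case 0: ea = (uint16_t)(r->bx + r->si); break;
	case 1: ea = (uint16_t)(r->bx + r->di); break;
	case 2: ea = (uint16_t)(r->bp + r->si); break;
	case 3: ea = (uint16_t)(r->bp + r->di); break;
	case 4: ea = r->si; break;
	case 5: ea = r->di; break;
	case 6: ea = (mod == 0) ? 0 : r->bp; break;
	default: ea = r->bx; break;
	}

	/* The sum wraps around at 64 KiB. */
	return (uint16_t)(ea + (uint16_t)disp);
}

/* Virtual-8086 linear address: (selector << 4) + offset, 20 bits wide. */
static uint32_t linear_addr_v8086(uint16_t selector, uint16_t ea)
{
	return (((uint32_t)selector << 4) + ea) & UINT32_C(0xfffff);
}
```

As an example, selector 0x1234 with offset 0x0010 yields the linear address
0x12350.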

++ Emulate UMIP instructions
A new fixup_umip_exception() function inspects the instruction at the
instruction pointer. If it is a UMIP-protected instruction, it executes
the emulation code. This uses all the address-computing code of the
previous section.
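
The policy described in this cover letter can be summarized by the
following sketch. It only models the decision; the enum and the function
umip_fixup_decision() are hypothetical and do not mirror the signature of
fixup_umip_exception().

```c
#include <stdbool.h>

enum fixup_result {
	FIXUP_EMULATED,         /* dummy value returned, task continues */
	FIXUP_DELIVER_SIGSEGV   /* general protection fault reaches the task */
};

/*
 * Decide what happens on a general protection fault taken in user mode,
 * following the rules in this cover letter.
 */
static enum fixup_result umip_fixup_decision(bool user_64bit,
					     bool is_umip_insn,
					     bool is_str_or_sldt)
{
	/* Emulation is not done for 64-bit processes. */
	if (user_64bit)
		return FIXUP_DELIVER_SIGSEGV;

	/* Faults from other instructions are not fixed up. */
	if (!is_umip_insn)
		return FIXUP_DELIVER_SIGSEGV;

	/* STR and SLDT are not emulated. */
	if (is_str_or_sldt)
		return FIXUP_DELIVER_SIGSEGV;

	/* SGDT, SIDT and SMSW in protected or virtual-8086 mode. */
	return FIXUP_EMULATED;
}
```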

++ Add self-tests
Lastly, self-tests are added to entry_from_vm86.c to exercise the most
typical use cases of UMIP-protected instructions in virtual-8086 mode.

++ Extensive tests
Extensive tests were performed to test all the combinations of ModRM,
SiB and displacements for 16-bit and 32-bit encodings for the SS, DS,
ES, FS and GS segments. Tests also include a 64-bit program that uses
segmentation via FS and GS. For this purpose, I temporarily enabled UMIP
support for 64-bit processes. This change is not part of this patchset.
The intention is to test the computations of linear addresses in 64-bit
mode, including the extra R8-R15 registers. Extensive tests are also
implemented for virtual-8086 tasks. Code of these tests can be found here
[12] and here [13].

++ Merging this series?
Eight versions of this series have been submitted. Am I anywhere close to
seeing these patches merged? :)
 
[1]. https://lwn.net/Articles/705877/
[2]. https://lkml.org/lkml/2016/12/23/265
[3]. https://lkml.org/lkml/2017/1/25/622
[4]. https://lkml.org/lkml/2017/2/23/40
[5]. https://lkml.org/lkml/2017/3/3/678
[6]. https://lkml.org/lkml/2017/3/7/866
[7]. https://lkml.org/lkml/2017/5/5/398
[8]. https://lkml.org/lkml/2017/8/18/992
[9]. http://timetobleed.com/a-closer-look-at-a-recent-privilege-escalation-bug-in-linux-cve-2013-2094/
[10]. https://www.winehq.org/pipermail/wine-devel/2017-April/117159.html
[11]. https://marc.info/?l=linux-kernel&m=147876798717927&w=2
[12]. https://github.com/01org/luv-yocto/tree/rneri/umip/meta-luv/recipes-core/umip/files
[13]. https://github.com/01org/luv-yocto/commit/a72a7fe7d68693c0f4100ad86de6ecabde57334f#diff-3860c136a63add269bce4ea50222c248R1

Thanks and BR,
Ricardo

Changes since V8:
*Simplified error handling in the family of get_addr_ref_xx functions
 by initializing linear address to -1L.
*Reworded commit that #define's an initial state of CR0 and removed unneeded
 comment.
*Reworked get_desc() to get rid of one mutex_unlock(). Used a new local variable
 to improve readability.
*Reworked the utility functions used to obtain the segment selector:
  + get_overridden_seg_reg_idx() now only inspects the instruction to find
    segment override prefixes.
  + A new function allow_seg_reg_overrides() determines if segment override
    prefixes can be used based on the register operand in use and the nature of
    the instruction (i.e., string instructions vs not).
  + resolve_seg_reg() uses the two functions above, along with user_64bit_mode()
    to resolve the segment register index: overridden, default or ignored.
*Renamed local variables to reflect the fact that our segment registers are
 indexes and not the actual hardware registers.
*Reworded function documentation for improved readability.

Changes since V7:
*UMIP is not enabled by default.
*Relocated definition of the initial state of CR0 into processor-flags.h
*Updated uprobes to use the autogenerated INAT_PFX_xS definitions instead of
 hard-coded values.
*In insn-eval.c, refer to segment override prefixes using the autogenerated
 INAT_PFX_XS definitions.
*Removed enumeration for segment registers that reused the segment override
 instruction prefixes. Instead, a new, separate, set of #defines is used in
 arch/x86/include/asm/inat.h
*Simplified function to identify string instruction.
*Split the code used to determine the relevant segment register into two
 functions: one to inspect segment overrides and a second one to determine
 default segment registers based on the instruction and operands. A third
 function reads the segment register to obtain the segment selector.
*Reworked arithmetic to compute 32-bit and 64-bit effective addresses. Instead
 of type casts, two separate functions are used in each case.
*Removed structure to hold segment default address and operand sizes. Used
 #defines instead.
*Corrected bug when determining the limit of a segment.
*Updated various functions to use error codes from errno-base.h
*Replaced printk_ratelimited with pr_err_ratelimited.
*Corrected typos and format errors in functions' documentation.
*Fixed unimplemented handling of emulation of the SMSW instruction.
*Added documentation to file containing implementation for UMIP.
*Improved error handling in fixup_umip_exception() function.

Changes since V6:
*Reworded and added more details on the special cases of ModRM and SIB
 bytes. To avoid confusion, I omitted mentioning the involved registers
 (EBP and ESP).
*Replaced BUG() with printk_ratelimited in function get_reg_offset of
 insn-eval.c
*Removed unused utility functions that obtain a register value from pt_regs
 given a SIB base and index.
*Clarified nomenclature to call CS, DS, ES, FS, GS and SS segment registers
 and their values segment selectors.
*Reworked function resolve_seg_register to issue an error when more than
 one segment override prefix is used in the instruction.
*Added logic in resolve_seg_register to ignore segment register when in
 long mode and not using FS or GS.
*Added logic to ensure the effective address is within the limits of the
 segment in protected mode.
*Added logic to ensure segment override prefixes are ignored when resolving
 the segment of EIP and EDI with string instructions.
*Added code to make user_64bit_mode() available in CONFIG_X86_32... and
 make it return false, of course.
*Merged the two functions that obtain the default address and operand size
 of a code segment into one as they are always used together.
*Corrected logic of displacement-only addressing in long mode to make the
 displacement relative to the RIP of the next instruction.
*Reworked logic to sign-extend 32-bit memory offsets into 64-bit signed
 memory offsets. This includes more checks and putting it all together in
 a utility function.
*Removed the 'unlikely' of conditional statements as we are not in a
 critical path.
*In virtual-8086 mode, ensure that effective addresses are always less
 than 0x10000, even when address override prefixes are used. Also, ensure
 that linear addresses have a size of 20 bits.
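
Two of the items above, displacement-only addressing in long mode and the
sign extension of 32-bit offsets, combine as in this sketch. It is an
illustration with a hypothetical function name, not the code in
insn-eval.c.

```c
#include <stdint.h>

/*
 * RIP-relative address in long mode: the 32-bit displacement is
 * sign-extended to 64 bits and added to the address of the *next*
 * instruction, that is, the faulting RIP plus the instruction length.
 */
static uint64_t rip_relative_addr(uint64_t insn_rip, uint8_t insn_len,
				  int32_t disp32)
{
	uint64_t next_rip = insn_rip + insn_len;

	/* The int64_t cast performs the sign extension. */
	return next_rip + (uint64_t)(int64_t)disp32;
}
```

For example, a 7-byte instruction at 0x400000 with a displacement of -0x10
references address 0x3ffff7.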

Changes since V5:
* Relocate the page fault error code enumerations to traps.h

Changes since V4:
* Audited patches to use braces in all the branches of conditional
  statements, except those in which the conditional action only takes one
  line.
* Implemented support in 64-bit builds for both 32-bit and 64-bit tasks in the
  instruction evaluating library.
* Split segment selector function in the instruction evaluating library
  into two functions to resolve the segment type by instruction override
  or default and a separate function to actually read the segment selector.
* Fixed a bug when evaluating 32-bit effective addresses with 64-bit
  kernels.
* Split patches further for easier review.
* Use signed variables for computation of effective address.
* Fixed issue with a spurious static modifier in function insn_get_addr_ref
  found by kbuild test bot.
* Removed comparison between true and fixup_umip_exception.
* Reworked check logic when identifying erroneous vs invalid values of the
  SiB base and index.

Changes since V3:
* Limited emulation to 32-bit and 16-bit modes. For 64-bit mode, a general
  protection fault is still issued when UMIP-protected instructions are
  executed with CPL > 0.
* Expanded instruction-evaluating code to obtain segment descriptor along
  with their attributes such as base address and default address and
  operand sizes. Also, support for 16-bit encodings in protected mode was
  implemented.
* When getting a segment descriptor, this includes support to obtain those
  of a local descriptor table.
* Now the instruction-evaluating code returns -EDOM when the value of
  registers should not be used in calculating the effective address. The
  value -EINVAL is left for errors.
* Incorporate the value of the segment base address in the computation of
  linear addresses.
* Renamed new instruction evaluation library from insn-kernel.c to
  insn-eval.c
* Exported functions insn_get_reg_offset_* to obtain the register offset
  by ModRM r/m, SiB base and SiB index.
* Improved documentation of functions.
* Split patches further for easier review.

Changes since V2:
* Added new utility functions to decode the memory addresses contained in
  registers when the 16-bit addressing encodings are used. This includes
  code to obtain and compute memory addresses using segment selectors for
  real-mode address translation.
* Added support to emulate UMIP-protected instructions for virtual-8086
  tasks.
* Added self-tests for virtual-8086 mode that contains representative
  use cases: address represented as a displacement, address in registers
  and registers as operands.
* Instead of maintaining a static variable for the dummy base addresses
  of the IDT and GDT, a hard-coded value is used.
* The emulated SMSW instruction now returns the value with which the CR0
  register is programmed in head_32/64.S. That is: PE | MP | ET | NE | WP
  | AM. For x86_64, PG is also enabled.
* The new file arch/x86/lib/insn-utils.c is now renamed as arch/x86/lib/
  insn-kernel.c. It also has its own header. This helps keep the kernel
  and objtool instruction decoders in sync. Also, the new insn-kernel.c
  contains utility functions that are only relevant in a kernel context.
* Removed printed warnings for errors that occur when decoding instructions
  with invalid operands.
* Added more comments on fixes in the instruction-decoding MPX functions.
* Now user_64bit_mode(regs) is used instead of test_thread_flag(TIF_IA32)
  to determine if the task is 32-bit or 64-bit.
* Found and fixed a bug in insn-decoder in which X86_MODRM_RM was
  incorrectly used to obtain the mod part of the ModRM byte.
* Added more explanatory code in emulation and instruction decoding code.
  This includes a comment regarding that copy_from_user could fail if there
  exists a memory protection key in place.
* Tested code with CONFIG_X86_DECODER_SELFTEST=y and everything passes now.
* Prefixed get_reg_offset_rm with insn_ as this function is exposed
  via a header file. For clarity, this function was added in a separate
  patch.

Changes since V1:
* Virtual-8086 mode tasks are not treated in a special manner. All code
  for this purpose was removed.
* Instead of attempting to disable UMIP during a context switch or when
  entering virtual-8086 mode, UMIP remains enabled all the time. General
  protection faults that occur are fixed-up by returning dummy values as
  detailed above.
* Removed umip= kernel parameter in favor of using clearcpuid=514 to
  disable UMIP.
* Removed selftests designed to detect the absence of SIGSEGV signals when
  running in virtual-8086 mode.
* Reused code from MPX to decode instruction operands. For this purpose
  code was put in a common location.
* Fixed two bugs in MPX code that decodes operands.
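
The value 514 passed to clearcpuid= above follows from how x86 CPU features
are numbered in cpufeatures.h: word * 32 + bit. The sketch below assumes
that UMIP lives in word 16, bit 2, which corresponds to CPUID leaf 7, ECX
bit 2. The function name is hypothetical.

```c
/* Feature number as used by clearcpuid=: capability word times 32 plus bit. */
static unsigned int cpu_feature_number(unsigned int word, unsigned int bit)
{
	return word * 32 + bit;
}
```

Under that assumption, cpu_feature_number(16, 2) evaluates to 514.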

Ricardo Neri (29):
  x86/mm: Relocate page fault error codes to traps.h
  x86/boot: Relocate definition of the initial state of CR0
  ptrace,x86: Make user_64bit_mode() available to 32-bit builds
  uprobes/x86: Use existing definitions for segment override prefixes
  x86/mpx: Simplify handling of errors when computing linear addresses
  x86/mpx: Use signed variables to compute effective addresses
  x86/mpx: Do not use SIB.index if its value is 100b and ModRM.mod is
    not 11b
  x86/mpx: Do not use SIB.base if its value is 101b and ModRM.mod = 0
  x86/mpx, x86/insn: Relocate insn util functions to a new insn-eval
    file
  x86/insn-eval: Do not BUG on invalid register type
  x86/insn-eval: Add a utility function to get register offsets
  x86/insn-eval: Add utility function to identify string instructions
  x86/insn-eval: Add utility functions to get segment selector
  x86/insn-eval: Add utility function to get segment descriptor
  x86/insn-eval: Add utility functions to get segment descriptor base
    address and limit
  x86/insn-eval: Add function to get default params of code segment
  x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and
    ModRM.rm is 101b
  x86/insn-eval: Incorporate segment base in linear address computation
  x86/insn-eval: Add support to resolve 32-bit address encodings
  x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
  x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
  x86/insn-eval: Add support to resolve 16-bit addressing encodings
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
  x86: Add emulation code for UMIP instructions
  x86/umip: Force a page fault when unable to copy emulated result to
    user
  x86: Enable User-Mode Instruction Prevention
  x86/traps: Fixup general protection faults caused by UMIP
  selftests/x86: Add tests for User-Mode Instruction Prevention
  selftests/x86: Add tests for instruction str and sldt

 arch/x86/Kconfig                              |   10 +
 arch/x86/include/asm/cpufeatures.h            |    1 +
 arch/x86/include/asm/disabled-features.h      |    8 +-
 arch/x86/include/asm/inat.h                   |   10 +
 arch/x86/include/asm/insn-eval.h              |   23 +
 arch/x86/include/asm/ptrace.h                 |    6 +-
 arch/x86/include/asm/traps.h                  |   18 +
 arch/x86/include/asm/umip.h                   |   12 +
 arch/x86/include/uapi/asm/processor-flags.h   |    5 +
 arch/x86/kernel/Makefile                      |    1 +
 arch/x86/kernel/cpu/common.c                  |   25 +-
 arch/x86/kernel/head_32.S                     |    3 -
 arch/x86/kernel/head_64.S                     |    3 -
 arch/x86/kernel/traps.c                       |    5 +
 arch/x86/kernel/umip.c                        |  350 +++++++
 arch/x86/kernel/uprobes.c                     |   15 +-
 arch/x86/lib/Makefile                         |    2 +-
 arch/x86/lib/insn-eval.c                      | 1213 +++++++++++++++++++++++++
 arch/x86/mm/fault.c                           |   88 +-
 arch/x86/mm/mpx.c                             |  120 +--
 tools/testing/selftests/x86/entry_from_vm86.c |   89 +-
 21 files changed, 1818 insertions(+), 189 deletions(-)
 create mode 100644 arch/x86/include/asm/insn-eval.h
 create mode 100644 arch/x86/include/asm/umip.h
 create mode 100644 arch/x86/kernel/umip.c
 create mode 100644 arch/x86/lib/insn-eval.c

-- 
2.7.4