linux-kernel - Re: [PATCH 25/43] x86/mm/kaiser: Unmap kernel from userspace page tables (core patch)

Message-ID: <20171126185152.ttgrdrbipzojyy6p@pd.tnic>
Date:   Sun, 26 Nov 2017 19:51:52 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     linux-kernel@...r.kernel.org,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Andy Lutomirski <luto@...capital.net>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH 25/43] x86/mm/kaiser: Unmap kernel from userspace page
 tables (core patch)

On Fri, Nov 24, 2017 at 06:23:53PM +0100, Ingo Molnar wrote:
> diff --git a/Documentation/x86/kaiser.txt b/Documentation/x86/kaiser.txt
> new file mode 100644

Here are some text cleanups and typo fixes on top, after reading through it:

---
diff --git a/Documentation/x86/kaiser.txt b/Documentation/x86/kaiser.txt
index 46efa3662e22..f2df0441f6ea 100644
--- a/Documentation/x86/kaiser.txt
+++ b/Documentation/x86/kaiser.txt
@@ -4,9 +4,10 @@ Overview
 KAISER is a countermeasure against attacks on kernel address
 information.  There are at least three existing, published,
 approaches using the shared user/kernel mapping and hardware features
-to defeat KASLR.  One approach referenced in the paper locates the
-kernel by observing differences in page fault timing between
-present-but-inaccessable kernel pages and non-present pages.
+to defeat KASLR.  One approach referenced in the paper
+(https://gruss.cc/files/kaiser.pdf) locates the kernel by
+observing differences in page fault timing between
+present-but-inaccessible kernel pages and non-present pages.
 
 When the kernel is entered via syscalls, interrupts or exceptions,
 page tables are switched to the full "kernel" copy.  When the
@@ -18,37 +19,36 @@ entry/exit functions themselves and the interrupt descriptor
 table (IDT).
 
 This helps to ensure that side-channel attacks that leverage the
-paging structures do not function when KAISER is enabled.  It can be
-enabled by setting CONFIG_KAISER=y
+paging structures do not function when KAISER is enabled.  Enable it
+by setting CONFIG_KAISER=y.
 
 Page Table Management
 =====================
 
 When KAISER is enabled, the kernel manages two sets of page
 tables.  The first copy is very similar to what would be present
-for a kernel without KAISER.  This includes a complete mapping of
-userspace that the kernel can use for things like copy_to_user().
+for a kernel without KAISER.  It includes a complete mapping of
+userspace that the kernel needs for things like copy_*_user().
 
 The second (shadow) is used when running userspace and mirrors the
-mapping of userspace present in the kernel copy.  It maps a only
+mapping of userspace present in the kernel copy.  It maps only
 the kernel data needed to enter and exit the kernel.
 
 The shadow is populated by the kaiser_add_*() functions.  Only
-kernel data which has been explicity mapped will appear in the
-shadow copy.  These calls are rare at runtime.
+kernel data which has been explicitly mapped will appear in the
+shadow copy. These calls are rare at runtime.
 
 For a new userspace mapping, the kernel makes the entries in its
 page tables like normal.  The only difference is when the kernel
 makes entries in the top (PGD) level.  In addition to setting the
-entry in the main kernel PGD, a copy if the entry is made in the
+entry in the main kernel PGD, a copy of the entry is made in the
 shadow PGD.
 
 For user space mappings the kernel creates an entry in the kernel
 PGD and the same entry in the shadow PGD, so the underlying page
-table to which the PGD entry points is shared down to the PTE
+table to which the PGD entry points to, is shared down to the PTE
 level.  This leaves a single, shared set of userspace page tables
-to manage.  One PTE to lock, one set set of accessed bits, dirty
-bits, etc...
+to manage.  One PTE to lock, one set of accessed, dirty bits, etc...
 
 Overhead
 ========
@@ -76,8 +76,8 @@ this protection comes at a cost:
   a. CR3 manipulation to switch between the page table copies
      must be done at interrupt, syscall, and exception entry
      and exit (it can be skipped when the kernel is interrupted,
-     though.)  Moves to CR3 are on the order of a hundred
-     cycles, and are required every at entry and every at exit.
+     though.)  CR3 modifications are in the order of a hundred
+     cycles, and are required at every entry and exit.
   b. Task stacks must be mapped/unmapped.  We need to walk
      and modify the shadow page tables at fork() and exit().
   c. Global pages are disabled.  This feature of the MMU
@@ -91,7 +91,7 @@ this protection comes at a cost:
      systems with PCID support, the context switch code must flush
      both the user and kernel entries out of the TLB, with an
      INVPCID in addition to the CR3 write.  This INVPCID is
-     generally slower than a CR3 write, but still on the order of
+     generally slower than a CR3 write, but still in the order of
      a hundred cycles.
   e. The shadow page tables must be populated for each new
      process.  Even without KAISER, the shared kernel mappings
@@ -123,7 +123,7 @@ Possible Future Work:
    code or userspace since it will not have to reload all of
    its TLB entries.  However, its upside is limited by PCID
    being used.
-4. Allow KAISER to enabled/disabled at runtime so folks can
+4. Allow KAISER to be enabled/disabled at runtime so folks can
    run a single kernel image.
 
 Debugging:
@@ -144,7 +144,7 @@ that are worth noting here.
    running perf.
  * Kernel crashes at the first exit to userspace.  entry_64.S
    bugs, or failing to map some of the exit code.
- * Crashes at first interrupt that interrupts userspace. The paths
+ * Crashes at the first interrupt that interrupts userspace. The paths
    in entry_64.S that return to userspace are sometimes separate
    from the ones that return to the kernel.
  * Double faults: overflowing the kernel stack because of page
@@ -157,4 +157,3 @@ that are worth noting here.
    as mount(8) failing to mount the rootfs.  These have
    tended to be TLB invalidation issues.  Usually invalidating
    the wrong PCID, or otherwise missing an invalidation.
-
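
To illustrate the "Page Table Management" text above, here is a minimal
userspace sketch of the PGD mirroring it describes. It is a toy model and not
the kernel code: struct mm_model, model_set_pgd(), USER_PGD_ENTRIES and the
assumption that the lower half of the PGD covers userspace are all made up for
illustration. It shows only that a userspace PGD entry is written to both
copies, so both point at the same lower-level table, while a kernel PGD entry
stays out of the shadow copy unless something maps it there explicitly.

```c
#include <assert.h>
#include <stdint.h>

#define PTRS_PER_PGD     512
/* Assumption for this model: the lower half of the PGD covers userspace. */
#define USER_PGD_ENTRIES (PTRS_PER_PGD / 2)

typedef uint64_t pgd_t;

/* Hypothetical model of one address space with its two top-level tables. */
struct mm_model {
	pgd_t kernel_pgd[PTRS_PER_PGD];	/* full copy, used in kernel mode */
	pgd_t shadow_pgd[PTRS_PER_PGD];	/* used while running userspace */
};

/*
 * Install a PGD entry.  Userspace entries are mirrored into the shadow
 * PGD, so both copies reference the same lower-level page table.
 * Kernel entries go into the kernel PGD only.
 */
static void model_set_pgd(struct mm_model *mm, unsigned int idx, pgd_t val)
{
	assert(idx < PTRS_PER_PGD);

	mm->kernel_pgd[idx] = val;
	if (idx < USER_PGD_ENTRIES)
		mm->shadow_pgd[idx] = val;
}
```

Calling model_set_pgd() with an index below USER_PGD_ENTRIES leaves identical
values in both arrays; with a higher index the shadow slot stays empty. The
explicit kaiser_add_*() population of the shadow copy is deliberately not
modelled here.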

-- 
Regards/Gruss,
    Boris.
