Message-Id: <cover.1470907718.git.luto@kernel.org>
Date:	Thu, 11 Aug 2016 02:35:20 -0700
From:	Andy Lutomirski <luto@...nel.org>
To:	x86@...nel.org
Cc:	Borislav Petkov <bp@...en8.de>, linux-kernel@...r.kernel.org,
	Brian Gerst <brgerst@...il.com>,
	Andy Lutomirski <luto@...nel.org>
Subject: [PATCH v6 0/3] virtually mapped stacks

Hi all-

Since the dawn of time, a kernel stack overflow has been a real PITA
to debug, has caused nondeterministic crashes some time after the
actual overflow, and has generally been easy to exploit for root.

With this series, arches can enable HAVE_ARCH_VMAP_STACK.  Arches
that enable it (just x86 for now) get virtually mapped stacks with
guard pages.  This causes reliable faults when the stack overflows.
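To illustrate the guard-page idea only, here is a minimal user-space
sketch.  It is an analogy, not the code from this series: the kernel
uses vmalloc, while this uses mmap() plus mprotect(), and the helper
names (alloc_guarded_stack, signal_from_guard_write) are made up for
the example.  A child process writes one byte below the usable area
and is killed by a fault instead of silently corrupting whatever
memory sits below it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `pages` usable pages with one inaccessible
 * guard page directly below them.  Returns the lowest usable address,
 * or NULL on failure.
 */
static char *alloc_guarded_stack(size_t pages, size_t page_size)
{
	size_t len = (pages + 1) * page_size;
	char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (base == MAP_FAILED)
		return NULL;

	/* The lowest page becomes the guard: any access to it faults. */
	if (mprotect(base, page_size, PROT_NONE) != 0) {
		munmap(base, len);
		return NULL;
	}
	return base + page_size;
}

/*
 * Hypothetical helper: fork a child that "overflows" by one byte into
 * the guard page.  Returns the signal that killed the child, 0 if the
 * child exited normally, or -1 on error.
 */
static int signal_from_guard_write(void)
{
	long ps = sysconf(_SC_PAGESIZE);
	int status;
	pid_t pid;

	assert(ps > 0);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		volatile char *p = alloc_guarded_stack(4, (size_t)ps);

		if (!p)
			_exit(2);
		p[0] = 1;	/* lowest usable byte: fine */
		p[-1] = 1;	/* one byte too far: hits the guard page */
		_exit(0);	/* not reached if the guard page works */
	}

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux I would expect signal_from_guard_write() to return SIGSEGV.
Without the mprotect() call the same write would land in the mapping's
first page and go unnoticed, which is roughly the situation a
non-vmapped kernel stack is in today.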

If the arch implements it well, we get a nice OOPS on stack overflow
(as opposed to panicking directly or otherwise exploding badly).  On
x86, the OOPS is nice, has a usable call trace, and the overflowing
task is killed cleanly.

This does not address interrupt stacks.

It's worth noting that s390 has an arch-specific gcc feature that
detects stack overflows by adjusting function prologues.  Arches
with features like that may wish to avoid using vmapped stacks to
avoid any possible performance hit.

Ingo, I think it may make sense to apply this in its own branch in
-tip.  By itself, it hurts performance a bit, but the next series
(moving thread_info into task_struct and caching stacks) appears to
mostly eliminate the slowdown and to actually speed up my benchmark
compared to the status quo.  I'm sending this separately because it's
logically separate and this may ease the cognitive load a bit.

The 0day bot is chewing on this as we speak.

Known issues:
- virtio_console and wusb will have issues.  Michael
  Tsirkin says he'll fix virtio_console.  Herbert Xu or I will fix wusb,
  although I'm not convinced that wusb hardware exists.

Changes from v5:
 - Rebase to 4.8-rc1
 - Separate out the vmapped stack bit, which can be applied by itself
   now that most of the preparatory work has landed in 4.8-rc1.
 - Default to Y (Ingo)

Changes from v4:
 - Fix kthread (Oleg)
 - Tidy up some changelogs and fold some patches (Borislav, Josh)
 - Add "x86/mm/64: In vmalloc_fault(), use CR3 instead of current->active_mm"
 - Make VMAP_STACK depend on !KASAN (not worth waiting for the fix, I think)

Changes from v3:
 - Minor cleanups
 - Rebased onto Linus' tree
 - All the thread_info stuff is new

Changes from v2:
 - Delete kernel_unmap_pages_in_pgd rather than hardening it (Borislav)
 - Fix sub-page stack accounting better (Josh)

Changes from v1:
 - Fix rewind_stack_and_do_exit (Josh)
 - Fix deadlock under load
 - Clean up generic stack vmalloc code
 - Many other minor fixes

Andy Lutomirski (3):
  fork: Add generic vmalloced stack support
  dma-api: Teach the "DMA-from-stack" check about vmapped stacks
  x86/mm/64: Enable vmapped stacks

 arch/Kconfig                        | 34 +++++++++++++
 arch/ia64/include/asm/thread_info.h |  2 +-
 arch/x86/Kconfig                    |  1 +
 arch/x86/include/asm/switch_to.h    | 28 ++++++++++-
 arch/x86/kernel/traps.c             | 62 ++++++++++++++++++++++++
 arch/x86/mm/tlb.c                   | 15 ++++++
 include/linux/sched.h               | 15 ++++++
 kernel/fork.c                       | 96 +++++++++++++++++++++++++++++--------
 lib/dma-debug.c                     | 39 ++++++++++++---
 9 files changed, 264 insertions(+), 28 deletions(-)

-- 
2.7.4
