Date:   Mon, 22 Aug 2016 15:27:19 -0500
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Kees Cook <keescook@...omium.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        "H . Peter Anvin" <hpa@...or.com>,
        "x86@...nel.org" <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Andy Lutomirski <luto@...capital.net>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Brian Gerst <brgerst@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Byungchul Park <byungchul.park@....com>,
        Nilay Vaish <nilayvaish@...il.com>
Subject: Re: [PATCH v4 54/57] x86/mm: convert arch_within_stack_frames() to
 use the new unwinder

On Fri, Aug 19, 2016 at 04:55:22PM -0500, Josh Poimboeuf wrote:
> On Fri, Aug 19, 2016 at 11:27:18AM -0700, Kees Cook wrote:
> > On Thu, Aug 18, 2016 at 6:06 AM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> > > Convert arch_within_stack_frames() to use the new unwinder.
> > >
> > > This also changes some existing behavior:
> > >
> > > - Skip checking of pt_regs frames.
> > > - Warn if it can't reach the grandparent's stack frame.
> > > - Warn if it doesn't unwind to the end of the stack.
> > >
> > > Signed-off-by: Josh Poimboeuf <jpoimboe@...hat.com>
> > 
> > All the stuff touching usercopy looks good to me. One question,
> > though, in looking through the unwinder. It seems like it's much more
> > complex than just the frame-hopping that the old
> > arch_within_stack_frames() did, but I'm curious to hear what you think
> > about its performance. We'll be calling this with every usercopy that
> > touches the stack, so I'd like to be able to estimate the performance
> > impact of this replacement...
> 
> Yeah, good point.  I'll take some measurements from before and after and
> get back to you.

I took some before/after measurements by enclosing the affected
functions with ktime calls to get the total time spent in each function,
and did a "find /usr >/dev/null" to trigger a bunch of user copies.

         copy_to/from_user  check_object_size  arch_within_stack_frames
before:  13ms               6.8ms              0.61ms
after:   17ms               11ms               4.6ms
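
For anyone wanting to try the same style of measurement outside the kernel, the "accumulate total time per function" approach can be sketched in user space. This is only an analogue under stated assumptions: the real measurement enclosed kernel functions with ktime calls, whereas the `timed` decorator, the `totals_ns` table, and the `check_object` stand-in below are hypothetical names invented for illustration.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical accumulator: total nanoseconds spent in each wrapped function.
totals_ns = defaultdict(int)


def timed(func):
    """Add the elapsed time of every call to totals_ns[func.__name__]."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter_ns()
        try:
            return func(*args, **kwargs)
        finally:
            # Accumulate even if the wrapped function raises.
            totals_ns[func.__name__] += time.perf_counter_ns() - start
    return wrapper


@timed
def check_object(buf):
    # Stand-in for the function under test; not the kernel's logic.
    return len(buf) > 0


# Drive the workload, then report the accumulated totals in milliseconds.
for _ in range(1000):
    check_object(b"x" * 64)

for name, ns in totals_ns.items():
    print(f"{name}: {ns / 1e6:.3f}ms")
```

Note that the timing calls themselves add overhead to each wrapped call, so totals gathered this way are best compared against each other rather than read as absolute costs.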

The unwinder port made arch_within_stack_frames() *much* (roughly 7.5x)
slower than its current simple implementation, and added about 30% (4ms)
to the total copy_to/from_user() run time.

Note that hardened usercopy itself is already quite slow:
check_object_size() accounted for about 52% of the user copy time (6.8ms
of 13ms).  With the unwinder port, that share rose to ~65% (11ms of 17ms).

"find /usr" took about 170ms of kernel time and 2.3s total.  So the
unwinder port added about 2% on the kernel side and 0.2% total for this
particular test case.  Though I'm sure there are more I/O-intensive
workloads out there which would be more adversely affected.
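
The percentages quoted above can be re-derived from the table. The following is a small arithmetic sketch, not part of the original measurement; the variable names are made up, and the inputs are just the rounded figures from this message (170ms of kernel time, 2.3s total). The 52% and ~65% figures match the share of copy time spent in check_object_size().

```python
# Rounded timings from the table, in milliseconds.
before = {"copy": 13.0, "check": 6.8, "frames": 0.61}
after = {"copy": 17.0, "check": 11.0, "frames": 4.6}

kernel_ms = 170.0   # kernel time for "find /usr"
total_ms = 2300.0   # total run time for "find /usr"

# Extra time the unwinder port added to copy_to/from_user().
added_ms = after["copy"] - before["copy"]

# Slowdown factor of arch_within_stack_frames().
frames_slowdown = after["frames"] / before["frames"]

# Relative increase in copy_to/from_user() run time.
copy_increase_pct = 100.0 * added_ms / before["copy"]

# Share of copy time spent in check_object_size(), before and after.
check_share_before_pct = 100.0 * before["check"] / before["copy"]
check_share_after_pct = 100.0 * after["check"] / after["copy"]

# Cost of the port relative to kernel time and to total run time.
kernel_overhead_pct = 100.0 * added_ms / kernel_ms
total_overhead_pct = 100.0 * added_ms / total_ms

print(f"added: {added_ms:.1f}ms")
print(f"arch_within_stack_frames slowdown: {frames_slowdown:.1f}x")
print(f"copy_to/from_user increase: {copy_increase_pct:.0f}%")
print(f"check_object_size share: {check_share_before_pct:.0f}% -> "
      f"{check_share_after_pct:.0f}%")
print(f"kernel-side overhead: {kernel_overhead_pct:.1f}%")
print(f"total overhead: {total_overhead_pct:.1f}%")
```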

I haven't yet looked to see where the bottlenecks are and if there could
be any obvious performance improvements.

BTW, ignoring the performance issues, using the unwinder here would have
some benefits:

- It protects pt_regs frames from being changed.  For example, during a
  page fault operation, the saved regs->ip on the stack is protected.

- Unlike the existing code, it could potentially work with
  __copy_from_user_inatomic() and copy_from_user_nmi(), which can copy
  to/from an irq/exception stack.  (I think check_stack_object() would
  need to be rewritten a bit so that it doesn't always assume the task
  stack.)

- It complains loudly if there's stack corruption or something else goes
  wrong with walking the stack instead of just silently failing.

- The same code could also work with DWARF if we ever add a DWARF
  unwinder (with a possible tweak to the unwinder API to get the stack
  frame header size).

-- 
Josh
