lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150820040050.GC2944@treble.redhat.com>
Date:	Wed, 19 Aug 2015 23:00:50 -0500
From:	Josh Poimboeuf <jpoimboe@...hat.com>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
	linux-kernel@...r.kernel.org, Michal Marek <mmarek@...e.cz>,
	Peter Zijlstra <peterz@...radead.org>,
	Andy Lutomirski <luto@...nel.org>,
	Borislav Petkov <bp@...en8.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andi Kleen <andi@...stfloor.org>,
	Pedro Alves <palves@...hat.com>,
	Namhyung Kim <namhyung@...il.com>,
	Bernd Petrovitsch <bernd@...rovitsch.priv.at>,
	Chris J Arges <chris.j.arges@...onical.com>,
	live-patching@...r.kernel.org
Subject: Re: [PATCH v10 03/20] x86/stackvalidate: Compile-time stack
 validation

On Wed, Aug 19, 2015 at 12:01:38PM +0200, Ingo Molnar wrote:
> 
> * Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> 
> > On Sat, Aug 15, 2015 at 12:23:54AM -0700, Andrew Morton wrote:
> > > On Thu, 13 Aug 2015 22:10:24 -0500 Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> > > 
> > > > This adds a CONFIG_STACK_VALIDATION option which enables a host tool
> > > > named stackvalidate which runs at compile time.
> > > 
> > > It would be useful to
> > > 
> > > - show example output for a typical error site
> > > 
> > > - describe the consequences of that error (ie: why should we fix
> > >   these things?)
> > > 
> > > - describe what needs to be done to repair these sites.
> > > 
> > > 
> > > IOW, what do we gain from merging all this stuff?
> > 
> > I attempted to do all that in Documentation/stack-validation.txt which
> > is in patch 03/20.  Does it address your concerns?  Here it is:
> 
> I think it answers the first and third question, but not the second one:
> 
>     - describe the consequences of that error (ie: why should we fix
>       these things?)
> 
> would be nice to document all that richly.

Ok.  I've tried to answer the "why" question from both a broad
perspective ("why do we need stackvalidate?") as well as for each of the
rules that it enforces.

---8<---

Subject: [PATCH] stackvalidate: Document why it's needed

Signed-off-by: Josh Poimboeuf <jpoimboe@...hat.com>
---
 Documentation/stack-validation.txt | 122 +++++++++++++++++++++++++++++++++++--
 1 file changed, 118 insertions(+), 4 deletions(-)

diff --git a/Documentation/stack-validation.txt b/Documentation/stack-validation.txt
index c3d3d35..94dad40 100644
--- a/Documentation/stack-validation.txt
+++ b/Documentation/stack-validation.txt
@@ -24,6 +24,101 @@ instructions).  Similarly, it knows how to follow switch statements, for
 which gcc sometimes uses jump tables.
 
 
+Why do we need stack validation?
+--------------------------------
+
+Here are some of the benefits of validating stack metadata:
+
+a) More reliable stack traces for frame pointer enabled kernels
+
+   Frame pointers are used for debugging purposes.  They allow runtime
+   code and debug tools to be able to walk the stack to determine the
+   chain of function call sites that led to the currently executing
+   code.
+
+   For some architectures, frame pointers are enabled by
+   CONFIG_FRAME_POINTER.  For some other architectures they may be
+   required by the ABI (sometimes referred to as "backchain pointers").
+
+   For C code, gcc automatically generates instructions for setting up
+   frame pointers when the -fno-omit-frame-pointer option is used.
+
+   But for asm code, the frame setup instructions have to be written by
+   hand, which most people don't do.  So the end result is that
+   CONFIG_FRAME_POINTER is honored for C code but not for most asm code.
+
+   For stack traces based on frame pointers to be reliable, all
+   functions which call other functions must first create a stack frame
+   and update the frame pointer.  If a first function doesn't properly
+   create a stack frame before calling a second function, the *caller*
+   of the first function will be skipped on the stack trace.
+
+   The benefit of stackvalidate here is that it ensures that *all*
+   functions honor CONFIG_FRAME_POINTER.  As a result, no functions will
+   ever [*] be skipped on a stack trace.
+
+   [*] unless an interrupt or exception has occurred at the very
+       beginning of a function before the stack frame has been created,
+       or at the very end of the function after the stack frame has been
+       destroyed.  This is an inherent limitation of frame pointers.
+
+b) 100% reliable stack traces for DWARF enabled kernels
+
+   (NOTE: This is not yet implemented)
+
+   As an alternative to frame pointers, DWARF Call Frame Information
+   (CFI) metadata can be used to walk the stack.  Unlike frame pointers,
+   CFI metadata is out of band.  So it doesn't affect runtime
+   performance and it can be reliable even when interrupts or exceptions
+   are involved.
+
+   For C code, gcc automatically generates DWARF CFI metadata.  But for
+   asm code, generating CFI is a tedious manual approach which requires
+   manually placed .cfi assembler macros to be scattered throughout the
+   code.  It's clumsy and very easy to get wrong, and it makes the real
+   code harder to read.
+
+   Stackvalidate will improve this situation in several ways.  For code
+   which already has CFI annotations, it will validate them.  For code
+   which doesn't have CFI annotations, it will generate them.  So an
+   architecture can opt to strip out all the manual .cfi annotations
+   from their asm code and have stackvalidate generate them instead.
+
+   We might also add a runtime stack validation debug option where we
+   periodically walk the stack from schedule() and/or an NMI to ensure
+   that the stack metadata is sane and that we reach the bottom of the
+   stack.
+
+   So the benefit of stackvalidate here will be that external tooling
+   should always show perfect stack traces.  And the same will be true
+   for kernel warning/oops traces if the architecture has a runtime
+   DWARF unwinder.
+
+c) Higher live patching compatibility rate
+
+   (NOTE: This is not yet implemented)
+
+   Currently with CONFIG_LIVEPATCH there's a basic live patching
+   framework which is safe for roughly 85-90% of "security" fixes.  But
+   patches can't have complex features like function dependency or
+   prototype changes, or data structure changes.
+
+   There's a strong need to support patches which have the more complex
+   features so that the patch compatibility rate for security fixes can
+   eventually approach something resembling 100%.  To achieve that, a
+   "consistency model" is needed, which allows tasks to be safely
+   transitioned from an unpatched state to a patched state.
+
+   One of the key requirements of the currently proposed livepatch
+   consistency model [*] is that it needs to walk the stack of each
+   sleeping task to determine if it can be transitioned to the patched
+   state.  If stackvalidate can ensure that stack traces are reliable,
+   this consistency model can be used and the live patching
+   compatibility rate can be improved significantly.
+
+   [*] https://lkml.kernel.org/r/cover.1423499826.git.jpoimboe@redhat.com
+
+
 Rules
 -----
 
@@ -35,15 +130,26 @@ To achieve the validation, stackvalidate enforces the following rules:
    outside of a function, it flags an error since that usually indicates
    callable code which should be annotated accordingly.
 
+   This rule is needed so that stackvalidate can properly identify each
+   callable function in order to analyze its stack metadata.
+
 2. Conversely, each section of code which is *not* callable should *not*
    be annotated as an ELF function.  The ENDPROC macro shouldn't be used
    in this case.
 
+   This rule is needed so that stackvalidate can ignore non-callable
+   code.  Such code doesn't have to follow any of the other rules.
+
 3. Each callable function which calls another function must have the
    correct frame pointer logic, if required by CONFIG_FRAME_POINTER or
    the architecture's back chain rules.  This can by done in asm code
    with the FRAME_BEGIN/FRAME_END macros.
 
+   This rule ensures that frame pointer based stack traces will work as
+   designed.  If function A doesn't create a stack frame before calling
+   function B, the _caller_ of function A will be skipped on the stack
+   trace.
+
 4. Dynamic jumps and jumps to undefined symbols are only allowed if:
 
    a) the jump is part of a switch statement; or
@@ -51,10 +157,18 @@ To achieve the validation, stackvalidate enforces the following rules:
    b) the jump matches sibling call semantics and the frame pointer has
       the same value it had on function entry.
 
+   This rule is needed so that stackvalidate can reliably analyze all of
+   a function's code paths.  If a function jumps to code in another
+   file, and it's not a sibling call, stackvalidate has no way to follow
+   the jump because it only analyzes a single file at a time.
+
 5. A callable function may not execute kernel entry/exit instructions.
    The only code which needs such instructions is kernel entry code,
    which shouldn't be be in callable functions anyway.
 
+   This rule is just a sanity check to ensure that callable functions
+   return normally.
+
 
 Errors in .S files
 ------------------
@@ -63,8 +177,8 @@ If you're getting an error in a compiled .S file which you don't
 understand, first make sure that the affected code follows the above
 rules.
 
-Here are some examples of common problems and suggestions for how to fix
-them.
+Here are some examples of common warnings reported by stackvalidate,
+what they mean, and suggestions for how to fix them.
 
 
 1. stackvalidate: asm_file.o: func()+0x128: call without frame pointer save/setup
@@ -73,8 +187,8 @@ them.
    updating the frame pointer.
 
    If func() is indeed a callable function, add proper frame pointer
-   logic using the FP_SAVE and FP_RESTORE macros.  Otherwise, remove its
-   ELF function annotation by changing ENDPROC to END.
+   logic using the FRAME_BEGIN and FRAME_END macros.  Otherwise, remove
+   its ELF function annotation by changing ENDPROC to END.
 
    If you're getting this error in a .c file, see the "Errors in .c
    files" section.
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ