linux-kernel - Re: [PATCH v3 04/29] x86: assembly, use ENDPROC for functions

Message-ID: <4a63c996-6c86-c298-dd9c-34b77afc6f27@suse.cz>
Date:   Wed, 17 May 2017 15:23:46 +0200
From:   Jiri Slaby <jslaby@...e.cz>
To:     Josh Poimboeuf <jpoimboe@...hat.com>
Cc:     mingo@...hat.com, tglx@...utronix.de, hpa@...or.com,
        x86@...nel.org, linux-kernel@...r.kernel.org,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Juergen Gross <jgross@...e.com>, xen-devel@...ts.xenproject.org
Subject: Re: [PATCH v3 04/29] x86: assembly, use ENDPROC for functions

On 05/13/2017, 12:15 AM, Josh Poimboeuf wrote:
>> Similarly, I have OBJTOOL(START_FUNC) and OBJTOOL(END_FUNC) emitted with
>> each FUNC_START/FUNC_END. So far, when manually expanded for simplicity,
>> it looks like this:
> 
> I like the idea of making objtool smart enough to read the entry code,
> and of combining automated annotations (where possible) with manual
> annotations (where necessary).  And it does make sense for objtool to
> automate every rsp-related push/pop/sub/add annotation.  That will make
> the entry code quite a bit cleaner since we don't need 'push_cfi' and
> friends anymore.
> 
> However, I think trying to force the entry code snippets into being
> normal functions would be awkward.  For example, C-type functions all
> start off with the following initial CFI state:
> 
>      LOC           CFA      ra
>   0000000000000000 rsp+8    c-8
> 
> That means the previous frame's stack pointer was at rsp+8 and the
> return instruction pointer is (rsp).  But those assumptions don't hold
> for non-C-type functions, which usually start with pt_regs or iret regs
> on the stack, or a blank slate.
> 
> So the initial CFI state is different between the two types of
> "functions".  And there are a lot of other differences.  C-type
> functions have to follow frame pointer conventions, for example.  So
> your FUNC_START macro (and objtool) would have to somehow figure out a
> way to make a distinction between the two.  So it would probably work
> out better if we kept the distinction between C-type functions and other
> code.

Ok, that makes a lot of sense.
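
To spell out the C-type initial state quoted above (CFA = rsp+8, return
address at CFA-8), here is a minimal C sketch. The struct and function
names are made up for illustration only; nothing like this exists in
objtool or the kernel.

```c
#include <stdint.h>

/* Hypothetical type, for illustration only. */
struct entry_frame {
	uintptr_t cfa;		/* canonical frame address */
	uintptr_t ra_slot;	/* address holding the return address */
};

/*
 * Initial CFI state of a C-type function on x86-64: the call
 * instruction has just pushed the return address, so the caller's
 * stack pointer (the CFA) is rsp + 8 and the return address is
 * stored at CFA - 8, i.e. at (rsp).
 */
static struct entry_frame c_func_entry_frame(uintptr_t rsp)
{
	struct entry_frame f;

	f.cfa = rsp + 8;
	f.ra_slot = f.cfa - 8;
	return f;
}
```

Entry code that starts with pt_regs or iret regs on the stack cannot be
described by this fixed rule, which is why it needs a different
starting state.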

> I think ENDPROC (or FUNC_START/FUNC_END) should mean "this function is
> 100% standardized to the C ABI and its debuginfo can be completely
> automated".  And any code outside of that would be "this code is special
> and needs a mix of automated and manual debuginfo annotations."

My only hesitation is what to call the others. I assume SYM_FUNC_START and
SYM_FUNC_END were agreed upon for the C-like functions.

For the others, what about simply:
  SYM_FUNC_START_SPECIAL/SYM_FUNC_END_SPECIAL
or
  SYM_CODE_START/SYM_CODE_END
or
  SOMETHING_ELSE
?

> I'm also not sure we need the objtool-specific macros.  It might be
> simpler to have macros which just output the cfi instead.  I guess this
> goes back to our previous discussions about whether objtool's CFI access
> should be read/write or write-only.  I don't remember, did we ever
> come to a conclusion with that?

Correct. I introduced the special objtool macros precisely to avoid r-w
access to the DWARF info in objtool. They would just put the same CFIs
into the .discard section, so that objtool can combine them with the
automatically injected annotations and put them in the correct place.
For (almost) free.

Our last discussion on this topic ended with w-only for objtool for
the moment. I originally wanted r-w to support inline assembly in C, but
you suggested that w-only is considerably easier, so we should start
with it. The r-w extension is doable, but the question is whether we
want the complexity now.

> Either way, from looking at the entry code, we may be able to get away
> with only the following .macros:
> 
> - DWARF_EMPTY_FRAME signal=0
> 
>   Mark all registers as undefined and potentially mark the frame as a
>   signal frame.
> 
> - DWARF_SET_CFA base=rsp offset=0 c_regs=0 extra_regs=0 iret_regs=0
> 
>   Set the CFA value.  Set c_regs, extra_regs, and/or iret_regs to
>   indicate which regs (if any) are stored just below the CFA.
> 
> - DWARF_SET_INDIRECT_CFA base=rsp offset=0 val_offset=0
> 
>   Set CFA = *(base + offset) + val_offset.  I only saw a few places
>   where this is needed, where it switches to the irq stack.  We might be
>   able to figure out a way to simplify the code in a non-intrusive way
>   to get rid of the need for this one.

Correct, it corresponds to what I had locally to make the DWARF unwinder
work through interrupts, in terms of CFIs:
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -463,6 +463,7 @@ SYM_FUNC_END(irq_entries_start)
        ALLOC_PT_GPREGS_ON_STACK
        SAVE_C_REGS
        SAVE_EXTRA_REGS
+       DW_CFI(.cfi_rel_offset rbp, RBP+8)
        ENCODE_FRAME_POINTER

        testb   $3, CS(%rsp)
@@ -497,7 +498,17 @@ SYM_FUNC_END(irq_entries_start)
        movq    %rsp, %rdi
        incl    PER_CPU_VAR(irq_count)
        cmovzq  PER_CPU_VAR(irq_stack_ptr), %rsp
+       DW_CFI(.cfi_def_cfa_register rdi)
+
        pushq   %rdi
+       DW_CFI(.cfi_escape 0x0f /* DW_CFA_def_cfa_expression */, 6 /* block len */, \
+               0x77 /* DW_OP_breg7 (rsp) */, 0 /* offset */, \
+               0x06 /* DW_OP_deref */, \
+               0x08 /* DW_OP_const1u */, SIZEOF_PTREGS, \
+               0x22 /* DW_OP_plus */)
+       DW_CFI(.cfi_offset rsp, -2*8)
+       DW_CFI(.cfi_offset rip, -5*8)
+
        /* We entered an interrupt context - irqs are off: */
        TRACE_IRQS_OFF

@@ -654,9 +665,15 @@ SYM_FUNC_END(common_interrupt)
  * APIC interrupts.
  */
 .macro apicinterrupt3 num sym do_sym
-SYM_FUNC_START(\sym)
+SYM_FUNC_START_ALIAS(\sym)
+       DW_CFI(.cfi_startproc simple)
+       DW_CFI(.cfi_signal_frame)
+       DW_CFI(.cfi_def_cfa rsp, 6*8)
+       DW_CFI(.cfi_rel_offset rsp, 4*8)
+       DW_CFI(.cfi_rel_offset rip, 1*8)
        ASM_CLAC
        pushq   $~(\num)
+       DW_CFI(.cfi_adjust_cfa_offset 8)
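
For reference, the .cfi_escape block in the second hunk encodes
CFA = *(rsp + 0) + SIZEOF_PTREGS, i.e. the saved old stack pointer that
was just pushed, plus the size of pt_regs. Below is a minimal C sketch
of an evaluator for just the four opcodes used there. The opcode values
come from the DWARF standard; the helper name is hypothetical, and it
reads memory directly, so it is only meant to illustrate the arithmetic,
not how an unwinder would be written.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as defined by the DWARF standard. */
enum {
	DW_OP_deref   = 0x06,
	DW_OP_const1u = 0x08,
	DW_OP_plus    = 0x22,
	DW_OP_breg7   = 0x77,	/* DWARF register 7 is rsp on x86-64 */
};

/*
 * Hypothetical helper, not objtool or kernel code: evaluate a CFA
 * expression built only from the four opcodes above.  DW_OP_deref
 * reads memory through the computed address, so the caller must make
 * sure that address is readable.  Returns 0 on success, -1 on
 * malformed or unsupported input.
 */
static int eval_cfa_expr(const uint8_t *expr, size_t len,
			 uintptr_t rsp, uintptr_t *cfa)
{
	uintptr_t stack[4];
	size_t depth = 0, i = 0;

	while (i < len) {
		switch (expr[i++]) {
		case DW_OP_breg7: {
			int off;

			/* Only a single-byte SLEB128 offset is handled. */
			if (i >= len || (expr[i] & 0x80) || depth == 4)
				return -1;
			off = expr[i] & 0x3f;
			if (expr[i] & 0x40)
				off -= 64;
			i++;
			stack[depth++] = rsp + (intptr_t)off;
			break;
		}
		case DW_OP_deref:
			if (depth < 1)
				return -1;
			stack[depth - 1] = *(const uintptr_t *)stack[depth - 1];
			break;
		case DW_OP_const1u:
			if (i >= len || depth == 4)
				return -1;
			stack[depth++] = expr[i++];
			break;
		case DW_OP_plus:
			if (depth < 2)
				return -1;
			stack[depth - 2] += stack[depth - 1];
			depth--;
			break;
		default:
			return -1;
		}
	}
	if (depth != 1)
		return -1;
	*cfa = stack[0];
	return 0;
}
```

Feeding it the bytes 0x77, 0, 0x06, 0x08, N, 0x22 with rsp pointing at a
slot that holds the old stack pointer yields old_rsp + N, where N stands
in for SIZEOF_PTREGS.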


(DW_CFI is a local-only macro of mine that lets me disable all the
annotations with a single switch whenever I want.)
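
A wrapper of that kind could look roughly like the following
preprocessor sketch. This is a guess at the shape, not the actual local
macro; the DW_CFI_ENABLED switch name is an assumption. Since .S files
are run through the C preprocessor, the same definition would work
there.

```c
/* Assumed switch name; set to 0 to drop every annotation at once. */
#define DW_CFI_ENABLED 1

#if DW_CFI_ENABLED
/* Pass the CFI directive through unchanged. */
# define DW_CFI(...) __VA_ARGS__
#else
/* Expand to nothing, removing the annotation. */
# define DW_CFI(...)
#endif
```

With the switch on, DW_CFI(.cfi_signal_frame) expands to
.cfi_signal_frame; with it off, the line expands to nothing.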

> And we could create higher-level macros from these primitives if needed.
> 
> I think we'd only need the macros in relatively few places in the entry
> code.  It would be a lot less intrusive than what we had before.

Sure, that's the whole point of this exercise :).

thanks,
-- 
js
suse labs