lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LSU.2.21.2001221556160.15957@pobox.suse.cz>
Date:   Wed, 22 Jan 2020 16:05:27 +0100 (CET)
From:   Miroslav Benes <mbenes@...e.cz>
To:     Josh Poimboeuf <jpoimboe@...hat.com>
cc:     Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Joe Lawrence <joe.lawrence@...hat.com>,
        Jessica Yu <jeyu@...nel.org>, x86@...nel.org,
        linux-kernel@...r.kernel.org, mhiramat@...nel.org,
        bristot@...hat.com, jbaron@...mai.com,
        torvalds@...ux-foundation.org, tglx@...utronix.de,
        mingo@...nel.org, namit@...are.com, hpa@...or.com, luto@...nel.org,
        ard.biesheuvel@...aro.org, live-patching@...r.kernel.org,
        Randy Dunlap <rdunlap@...radead.org>, mjambor@...e.cz
Subject: Re: [PATCH v3 5/6] x86/ftrace: Use text_poke()

On Wed, 22 Jan 2020, Miroslav Benes wrote:

> On Tue, 21 Jan 2020, Josh Poimboeuf wrote:
> 
> > On Tue, Jan 21, 2020 at 09:35:28AM +0100, Miroslav Benes wrote:
> > > On Mon, 20 Jan 2020, Josh Poimboeuf wrote:
> > > 
> > > > On Mon, Oct 21, 2019 at 10:05:49AM -0500, Josh Poimboeuf wrote:
> > > > > On Wed, Oct 16, 2019 at 09:42:17AM +0200, Peter Zijlstra wrote:
> > > > > > > which are not compatible with livepatching. GCC upstream now has
> > > > > > > -flive-patching option, which disables all those interfering optimizations.
> > > > > > 
> > > > > > Which, IIRC, has a significant performance impact and should thus really
> > > > > > not be used...
> > > > > > 
> > > > > > If distros ship that crap, I'm going to laugh at them the next time they
> > > > > > want a single digit performance improvement because *important*.
> > > > > 
> > > > > I have a crazy plan to try to use objtool to detect function changes at
> > > > > a binary level, which would hopefully allow us to drop this flag.
> > > > > 
> > > > > But regardless, I wonder if we enabled this flag prematurely.  We still
> > > > > don't have a reasonable way to use it for creating source-based live
> > > > > patches upstream, and it should really be optional for CONFIG_LIVEPATCH,
> > > > > since kpatch-build doesn't need it.
> > > > 
> > > > I also just discovered that -flive-patching is responsible for all those
> > > > "unreachable instruction" objtool warnings which Randy has been
> > > > dutifully bugging me about over the last several months.  For some
> > > > reason it subtly breaks GCC implicit noreturn detection for local
> > > > functions.
> > > 
> > > Ugh, that is unfortunate. Have you reported it?
> > 
> > Not yet (but I plan to).
> 
> My findings so far...
> 
> I bisected through GCC options which -flive-patching disables and 
> -fno-ipa-pure-const is the culprit. I got no warnings without the option 
> with my config.
> 
> Then I found out allmodconfig was ok even with -flive-patching. 
> CONFIG_GCOV is the difference. CONFIG_GCOV=y seems to make the warnings go 
> away here.

Sorry, that was a red herring. See 867ac9d73709 ("objtool: Fix gcov check 
for older versions of GCC").

I started looking at some btrfs reports and then found out those were 
already fixed. 
https://lore.kernel.org/linux-btrfs/cd4091e4-1c04-a880-f239-00bc053f46a2@infradead.org/

arch/x86/kernel/cpu/mce/core.o: warning: objtool: mce_panic()+0x11b: unreachable instruction
was next...

Broken code (-fno-ipa-pure-const):
...
    1186:       e8 a5 fe ff ff          callq  1030 <wait_for_panic>
    118b:       e9 23 ff ff ff          jmpq   10b3 <mce_panic+0x43>
</end of function>

Working code (-fipa-pure-const):
     753:       e8 88 fe ff ff          callq  5e0 <wait_for_panic>
     758:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
     75f:       00 

mce_panic() has:
                if (atomic_inc_return(&mce_panicked) > 1)                                                              
                        wait_for_panic();
                barrier();
                
                bust_spinlocks(1);                                                                                     

jmpq in the broken code goes to bust_spinlocks(1), because GCC does not 
know that wait_for_panic() is noreturn... because it is not. 
wait_for_panic() calls panic() unconditionally in the end, which is 
noreturn.

So the question is why ipa-pure-const optimization knows about panic()'s 
noreturn. The answer is that it is right one of the things the 
optimization does. It propagates inner noreturns to its callers. (Martin 
Jambor CCed).

Marking wait_for_panic() as noreturn (__noreturn), of course, fixes it 
then. Now I don't know what the right fix should be. Should we mark all 
these sites as noreturn, or is it ok for the kernel to rely on GCC 
behaviour in this case? Could we teach objtool to recognize this?

Miroslav

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ