[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LSU.2.21.2004141343370.11688@wotan.suse.de>
Date: Tue, 14 Apr 2020 13:50:29 +0000 (UTC)
From: Michael Matz <matz@...e.de>
To: Borislav Petkov <bp@...en8.de>
cc: Jakub Jelinek <jakub@...hat.com>,
Sergei Trofimovich <slyfox@...too.org>,
linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, x86@...nel.org
Subject: Re: [PATCH v2] x86: fix early boot crash on gcc-10
Hello,
On Mon, 13 Apr 2020, Borislav Petkov wrote:
> > @@ -207,8 +207,11 @@ static int cpu0_logical_apicid;
> > static int enable_start_cpu0;
> > /*
> > * Activate a secondary processor.
> > + *
> > + * Note: 'boot_init_stack_canary' changes canary value. Omit
> > + * stack protection to avoid canary check (and boot) failure.
> > */
> > -static void notrace start_secondary(void *unused)
> > +static void __no_stack_protector notrace start_secondary(void *unused)
>
> Hmm, so we did this per-function marking only but that explodes on
> 32-bit, see splat at the end. gcc guys, any ideas?
>
> The null pointer deref happens this way:
>
> The __no_stack_protector annotated function start_secondary() calls
> trace_hardirqs_on(). On entry, that function pushes the frame pointer on
> the stack:
>
> trace_hardirqs_on:
> pushl %ebp #
> movl %esp, %ebp #,
> subl $20, %esp #,
> movl %ebx, -12(%ebp) #,
> movl %esi, -8(%ebp) #,
> movl %edi, -4(%ebp) #,
>
>
> Singlestepping the whole thing in gdb looks like this:
>
> Dump of assembler code from 0xc1158610 to 0xc1158624:
> => 0xc1158610 <trace_hardirqs_on+0>: 55 push %ebp <---
> 0xc1158611 <trace_hardirqs_on+1>: 89 e5 mov %esp,%ebp
>
> and ebp has:
>
> ...
> ebp 0x0 0x0 <---
> esi 0x200002 2097154
> edi 0x1 1
> eip 0xc1158610
> ...
>
> Later in the function, it will do __builtin_return_address(n), which
> turns into:
>
> # kernel/trace/trace_preemptirq.c:26: trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
> movl 0(%ebp), %eax #, tmp133
> # kernel/trace/trace_preemptirq.c:27: tracer_hardirqs_on(CALLER_ADDR0, CALLER_ADDR1);
> movl 4(%eax), %edx #, tmp130
So this part expects that the caller (!) of trace_hardirqs_on was compiled
with a frame pointer (in %ebp). Obviously that's not the case as you
traced above. Is start_secondary the immediate caller in the above
case?
> <--- derefs it here. Boom.
>
> So, could it be that marking this one function like this:
>
> static void __attribute__((optimize("-fno-stack-protector"))) __attribute__((no_instrument_function)) start_secondary(void *unused)
> {
>
> would cause %ebp to be 0 for whatever reason on 32-bit?
Look at it's disassembly. If it doesn't have the usual push
%ebp/mov%esp,%ebp prologue it probably doesn't use a frame pointer. In
that case I would speculate that having a frame pointer was (before the
change above) only a side-effect of being compiled with -fstack-protector,
which got now disabled. But I was under the impression that the upstream
kernels build with -fno-omit-frame-pointer, so that sounds unexpected.
But I have no better explanation at the moment. If the above speculation
doesn't make you progress: preprocessed file and a note of what the
immediate caller of trace_hardirqs_on is in the above case, please.
Ciao,
Michael.
Powered by blists - more mailing lists