Message-ID: <CA+55aFxevxgeMTKHFippEJrkryyyWg6zEiourRjtxVJAXQ=dvA@mail.gmail.com>
Date: Fri, 29 Dec 2017 14:09:00 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Alexander Tsoy <alexander@...y.me>
Cc: Dave Hansen <dave.hansen@...el.com>, Greg KH <greg@...ah.com>,
Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...e.de>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Borislav Petkov <bp@...en8.de>,
Borislav Petkov <bpetkov@...e.de>,
Brian Gerst <brgerst@...il.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
David Laight <David.Laight@...lab.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
Eduardo Valentin <eduval@...zon.com>,
Greg KH <gregkh@...uxfoundation.org>,
"H. Peter Anvin" <hpa@...or.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Juergen Gross <jgross@...e.com>,
Peter Zijlstra <peterz@...radead.org>,
Rik van Riel <riel@...hat.com>,
Will Deacon <will.deacon@....com>,
"Liguori, Anthony" <aliguori@...zon.com>,
Daniel Gruss <daniel.gruss@...k.tugraz.at>,
Hugh Dickins <hughd@...gle.com>,
Kees Cook <keescook@...gle.com>,
Kernel Mailing List <linux-kernel@...r.kernel.org>,
stable <stable@...r.kernel.org>
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot
On Fri, Dec 29, 2017 at 1:50 PM, Alexander Tsoy <alexander@...y.me> wrote:
>>
>> Ho humm. What happens if you change the "-march=core2" to
>> "-mtune=core2"? Does it still boot?
>
> That's interesting. Compiled with -mtune=core2, the kernel fails to
> boot.
[ Insert "twilight zone" theme music ]

Damn. I was hoping that "-march=core2" would enable something specific
that causes the failure, and that "-mtune=core2" would just schedule
for core2 but not fail, and then we could compare the two and see what
triggers things.

But apparently no such luck. It's apparently just fundamentally the
instruction scheduling and selection for core2 that causes problems,
so mtune ends up being the same as march.
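
For readers unfamiliar with the distinction being tested here, a minimal userspace sketch follows. It relies on GCC's documented behaviour that "-march=core2" permits the Core 2 instruction-set extensions (including SSSE3, signalled by the predefined macro __SSSE3__), while "-mtune=core2" only adjusts scheduling and instruction choice within the baseline ISA. The function names are made up for this illustration, and a kernel build passes additional flags that restrict SIMD code generation, so this shows the general flag semantics rather than what the kernel build actually sees.

```c
#include <stdio.h>

/*
 * Returns 1 if the compiler was permitted to emit SSSE3 instructions.
 * GCC predefines __SSSE3__ when the selected -march (or -mssse3) allows it.
 * A plain -mtune=core2 build is not expected to define it.
 */
int ssse3_enabled(void)
{
#ifdef __SSSE3__
    return 1;
#else
    return 0;
#endif
}

/* Human-readable summary of which situation this translation unit is in. */
const char *isa_summary(void)
{
    return ssse3_enabled()
        ? "ISA extended: SSSE3 allowed (as with -march=core2)"
        : "baseline ISA: SSSE3 not allowed (as with -mtune=core2 alone)";
}

/* Print the summary; call this from a small main() to compare builds. */
void report_isa(void)
{
    printf("%s\n", isa_summary());
}
```

Building the same file once with each flag and calling report_isa() shows whether the two builds differ in permitted instructions or only in tuning. Since both variants fail here, the ISA extensions themselves do not appear to be the trigger.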

It could be something entirely random, and some instruction scheduling
detail just ends up showing it by happenstance.

And sadly, we have almost nothing to go by.

The fact that double faults seem to be implicated does make me want to
try to disable that ESPFIX64 code in the #DF handler.

What happens if you take a failing kernel, and then in
arch/x86/kernel/traps.c do_double_fault(), you change the

    #ifdef CONFIG_X86_ESPFIX64

to just a

    #if 0

do you then get an actual double-fault oops report instead of the
stall (and NMI oops)?
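
The effect of that one-line change can be sketched in isolation. The code below is not the kernel's do_double_fault(); it is a self-contained illustration in which DEMO_ESPFIX64 is a hypothetical stand-in for CONFIG_X86_ESPFIX64 and the returned strings stand in for the two outcomes. It only shows that "#if 0" compiles the guarded block out regardless of configuration, so every fault falls through to the ordinary reporting path.

```c
#include <string.h>

/* Hypothetical stand-in for CONFIG_X86_ESPFIX64, defined here so the
 * demo is self-contained. In a real kernel this comes from the .config. */
#define DEMO_ESPFIX64 1

/* Original form: the special-case path exists whenever the option is set. */
const char *fault_path_ifdef(int looks_like_special_case)
{
#ifdef DEMO_ESPFIX64
    if (looks_like_special_case)
        return "special-case fixup";
#endif
    return "plain double-fault report";
}

/* Experimental form: "#if 0" removes the block unconditionally, so the
 * function always falls through to the plain report. */
const char *fault_path_if0(int looks_like_special_case)
{
#if 0
    if (looks_like_special_case)
        return "special-case fixup";
#endif
    (void)looks_like_special_case; /* unused once the block is compiled out */
    return "plain double-fault report";
}
```

With the block compiled out, a fault that the special case would otherwise have intercepted should surface as an ordinary report, which is the point of the experiment: it separates "the special-case path is misbehaving" from "something else is wrong".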

But honestly, I'm just throwing random ideas out now.

Hopefully somebody else has a better idea than I do. Andy?

              Linus