Message-ID: <1514558513.28262.3.camel@tsoy.me>
Date: Fri, 29 Dec 2017 17:41:53 +0300
From: Alexander Tsoy <alexander@...y.me>
To: Greg KH <greg@...ah.com>, Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>
Cc: Borislav Petkov <bp@...e.de>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Borislav Petkov <bp@...en8.de>,
Borislav Petkov <bpetkov@...e.de>,
Brian Gerst <brgerst@...il.com>,
Dave Hansen <dave.hansen@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
David Laight <David.Laight@...lab.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
Eduardo Valentin <eduval@...zon.com>,
Greg KH <gregkh@...uxfoundation.org>,
"H. Peter Anvin" <hpa@...or.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Juergen Gross <jgross@...e.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Rik van Riel <riel@...hat.com>,
Will Deacon <will.deacon@....com>, aliguori@...zon.com,
daniel.gruss@...k.tugraz.at, hughd@...gle.com, keescook@...gle.com,
Kernel Mailing List <linux-kernel@...r.kernel.org>,
stable <stable@...r.kernel.org>
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot
On Fri, 29/12/2017 at 17:31 +0300, Alexander Tsoy wrote:
> On Fri, 29/12/2017 at 10:17 +0100, Greg KH wrote:
> > On Thu, Dec 28, 2017 at 12:33:22PM +0300, Alexander Tsoy wrote:
> > > Hello,
> > >
> > > 4.14.9 fails to boot if CONFIG_MCORE2 is enabled and when compiled with
> > > gcc 6+. More details in the following bug reports:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=198263
> > > https://bugs.gentoo.org/642268
> > >
> > > I bisected it to the commit below:
> > >
> > > $ git bisect good
> > > 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a is the first bad commit
> > > commit 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a
> > > Author: Andy Lutomirski <luto@...nel.org>
> > > Date: Mon Dec 4 15:07:23 2017 +0100
> > >
> > > x86/entry/64: Use a per-CPU trampoline stack for IDT entries
> > >
> > > commit 7f2590a110b837af5679d08fc25c6227c5a8c497 upstream.
> > >
> > > Historically, IDT entries from usermode have always gone directly
> > > to the running task's kernel stack. Rearrange it so that we enter on
> > > a per-CPU trampoline stack and then manually switch to the task's
> > > stack. This touches a couple of extra cachelines, but it gives us a
> > > chance to run some code before we touch the kernel stack.
> > >
> > > The asm isn't exactly beautiful, but I think that fully refactoring
> > > it can wait.
> > >
> > > Signed-off-by: Andy Lutomirski <luto@...nel.org>
> > > Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> > > Reviewed-by: Borislav Petkov <bp@...e.de>
> > > Reviewed-by: Thomas Gleixner <tglx@...utronix.de>
> > > Cc: Boris Ostrovsky <boris.ostrovsky@...cle.com>
> > > Cc: Borislav Petkov <bp@...en8.de>
> > > Cc: Borislav Petkov <bpetkov@...e.de>
> > > Cc: Brian Gerst <brgerst@...il.com>
> > > Cc: Dave Hansen <dave.hansen@...el.com>
> > > Cc: Dave Hansen <dave.hansen@...ux.intel.com>
> > > Cc: David Laight <David.Laight@...lab.com>
> > > Cc: Denys Vlasenko <dvlasenk@...hat.com>
> > > Cc: Eduardo Valentin <eduval@...zon.com>
> > > Cc: Greg KH <gregkh@...uxfoundation.org>
> > > Cc: H. Peter Anvin <hpa@...or.com>
> > > Cc: Josh Poimboeuf <jpoimboe@...hat.com>
> > > Cc: Juergen Gross <jgross@...e.com>
> > > Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> > > Cc: Peter Zijlstra <peterz@...radead.org>
> > > Cc: Rik van Riel <riel@...hat.com>
> > > Cc: Will Deacon <will.deacon@....com>
> > > Cc: aliguori@...zon.com
> > > Cc: daniel.gruss@...k.tugraz.at
> > > Cc: hughd@...gle.com
> > > Cc: keescook@...gle.com
> > > Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix.de
> > > Signed-off-by: Ingo Molnar <mingo@...nel.org>
> > > Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> > >
> > > :040000 040000 275d4746936a9e521a2b5041856f7dc1d1820dc6 8f8e869fd59c3dd781dceffa76e53e41d733a0cf M arch
> > >
> > > $ git bisect log
> > > git bisect start
> > > # bad: [dad5c1402c570cd07a80113784bc20a7f930c8ae] Linux 4.14.9
> > > git bisect bad dad5c1402c570cd07a80113784bc20a7f930c8ae
> > > # good: [7b3775017f4e6b87dfd2c7f63d1eaf057948f31d] Linux 4.14.8
> > > git bisect good 7b3775017f4e6b87dfd2c7f63d1eaf057948f31d
> > > # good: [d120cd749ef9770ee98b708a83b49547dcf1c0e1] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
> > > git bisect good d120cd749ef9770ee98b708a83b49547dcf1c0e1
> > > # bad: [97f41b41c432e5a80c91445d92c2f4b729984d36] powerpc/xmon: Avoid tripping SMP hardlockup watchdog
> > > git bisect bad 97f41b41c432e5a80c91445d92c2f4b729984d36
> > > # bad: [bfd66a406fe7e590055c1d6714adc697f18664c8] PCI: Avoid bus reset if bridge itself is broken
> > > git bisect bad bfd66a406fe7e590055c1d6714adc697f18664c8
> > > # bad: [8388d287e361a2fd0a39bece30a736d692d5c3d8] x86/cpufeatures: Make CPU bugs sticky
> > > git bisect bad 8388d287e361a2fd0a39bece30a736d692d5c3d8
> > > # bad: [bb568391775d4a840992e2d2493f39d6e86401e3] x86/entry/64: Move the IST stacks into struct cpu_entry_area
> > > git bisect bad bb568391775d4a840992e2d2493f39d6e86401e3
> > > # bad: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries
> > > git bisect bad 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a
> > > # good: [c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e] x86/espfix/64: Stop assuming that pt_regs is on the entry stack
> > > git bisect good c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e
> > > # first bad commit: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries
> >
> > Thanks for letting us know. Does Linus's current tree also have this
> > same problem for you?
>
> Just tested Linus's master branch and it has the same problem. All I
> can catch with a serial console is the following:
>
> [ 0.000000] ACPI BIOS Warning[ 0.498898] Expanded resource
> conflict with PCI Bus 0000:00
Ooops. This one is correct:
[ 0.000000] ACPI BIOS Warning (bug): 32/64X length mismatch in
FADT/Gpe0Block: 128/64 (20170831/tbfadt-603)
[ 0.000000] ACPI BIOS Warning (bug): Incorrect checksum in table
[TCPA] - 0x00, should be 0x7F (20170831/tbprint-211)
[ 0.499627] Expanded resource Reserved due to conflict with PCI Bus
0000:00
[ 0.506002] Expanded resource Reserved due to conflict with PCI Bus
0000:00
[ 21.776011] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 21.777008] 0-...!: (0 ticks this GP) idle=c56/140000000000000/0
softirq=73/73 fqs=0
[ 21.777008] (detected by 1, t=21002 jiffies, g=-255, c=-256, q=4)
[ 0.775461] NMI backtrace for cpu 0
[ 0.775461] CPU: 0 PID: 114 Comm: modprobe Not tainted 4.15.0-rc5+
#1
[ 0.775461] Hardware name: Dell Inc. OptiPlex
760 /0M858N, BIOS A16 08/06/2013
[ 0.775461] RIP: 0010:paranoid_entry+0x58/0x70
[ 0.775461] RSP: 0000:fffffe8000007f50 EFLAGS: 00000083
[ 0.775461] RAX: 0000000077c00000 RBX: 0000000000000001 RCX:
00000000c0000101
[ 0.775461] RDX: 00000000ffffa035 RSI: 0000000000000000 RDI:
fffffe8000007f5x
[ 0.775461] RBP: 0000000000000000 R08: 0000000000000000 R09:
0000000000000000
[ 0.775461] R10: 0000000000000000 R11: 0000000000000000 R12:
ffffffffaecb5b36
[ 0.775461] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
[ 0.775461] FS: 0000000000000000(0000) GS:ffffa03577c00000(0000)
knlGS:0000000000000000
[ 0.775461] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.775461] CR2: fffffe8000006f08 CR3: 000000022952c000 CR4:
00000000000406f0
[ 0.775461] Call Trace:
[ 0.775461] <#DF>
[ 0.775461] ? double_fault+0xc/0x30
[ 0.775461] ? page_fault+0x36/0x60
[ 0.775461] do_double_fault+0xb/0x130
[ 0.775461] </#DF>
[ 0.775461] Code: 78 4c 89 7c 24 08 4c 89 74 24 10 4c 89 6c 24 18 4c
89 64 24 20 48 89 6c 24 28 48 89 5c 24 30 bb 01 00 00 00 b9 01 01 00 c0
0f 32 <85> d2 78 05 0f 01 f8 31 db c3 0f 1f 40 00 66 2e 0f 1f 84 00 00
[ 21.777008] rcu_preempt kthread starved for 21002 jiffies!
g18446744073709551361 c18446744073709551360 f0x0 RCU_GP_WAIT_FQS(3)
->state=0x402 ->cpu=0
[ 21.777008] Call Trace:
[ 21.777008] ? __schedule+0x37f/0x7b0
[ 21.777008] ? preempt_count_add+0x64/0xa0
[ 21.777008] schedule+0x4a/0xa0
[ 21.777008] schedule_timeout+0x179/0x380
[ 21.777008] ? __next_timer_interrupt+0xd0/0xd0
[ 21.777008] rcu_gp_kthread+0x96b/0x1050
[ 21.777008] ? calc_global_load_tick+0x61/0x70
[ 21.777008] kthread+0xff/0x130
[ 21.777008] ? force_qs_rnp+0x1d0/0x1d0
[ 21.777008] ? kthread_create_worker_on_cpu+0x70/0x70
[ 21.777008] ret_from_fork+0x1f/0x30
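
A note for anyone decoding the NMI backtrace above: the Code: bytes around the marked instruction (b9 01 01 00 c0 / 0f 32 / <85> d2 / 78 05 / 0f 01 f8) appear to be "mov $0xc0000101,%ecx; rdmsr; test %edx,%edx; js; swapgs", i.e. a read of MSR_GS_BASE followed by a sign check on the high half. The Python sketch below, assuming that decoding is right, recombines the EDX:EAX halves from the register dump and compares the result with the GS value printed in the same dump. The helper names are illustrative only and are not kernel symbols.

```python
# Values copied from the register dump in the backtrace.
MSR_GS_BASE = 0xC0000101       # RCX at the time of the dump
RDX = 0x00000000FFFFA035       # high half returned by rdmsr
RAX = 0x0000000077C00000       # low half returned by rdmsr
GS_FROM_DUMP = 0xFFFFA03577C00000


def msr_value(edx: int, eax: int) -> int:
    """Combine the two 32-bit halves that rdmsr leaves in EDX:EAX."""
    return ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)


def high_half_is_negative(edx: int) -> bool:
    """Mirror 'test %edx,%edx; js': true when bit 31 of EDX is set,
    which is the case for a kernel-space GS base."""
    return bool(edx & 0x80000000)


gs_base = msr_value(RDX, RAX)
print(hex(gs_base))
print(gs_base == GS_FROM_DUMP)
print(high_half_is_negative(RDX))
```

If the decoding holds, the recombined value matches the GS base in the dump and the sign check indicates a kernel GS, so CPU 0 was sampled in paranoid_entry right after the rdmsr. This only helps with reading the dump; it says nothing about why the double fault occurs.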