Message-ID: <219224e6-71f5-3209-09d5-9863a0b6fd4a@amd.com>
Date: Wed, 6 Dec 2017 18:58:41 +0100
From: Christian König <christian.koenig@....com>
To: Ingo Molnar <mingo@...nel.org>, Bjorn Helgaas <helgaas@...nel.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
Andy Shevchenko <andy.shevchenko@...il.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [regression] PCI early boot hang on certain AMD systems
Hi Ingo,
This is a known issue with multi-socket systems and the patch in question.
The attached set of patches should fix the issue and has already been sent to
Bjorn for inclusion in the next rc.
Sorry for the noise,
Christian.
On 06.12.2017 at 17:16, Ingo Molnar wrote:
> Hi,
>
> * Bjorn Helgaas <helgaas@...nel.org> wrote:
>
>> PCI changes:
>> Christian König (4):
>> x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)
> In v4.15 one of my test systems broke: it hangs in early bootup, during early PCI
> setup:
>
> [ 2.262005] pci 0000:00:18.1: adding root bus resource [mem 0x1027000000-0xfcffffffff 64bit pref window] <--- new resource
> [ 2.270081] pci 0000:00:18.2: [1022:1602] type 00 class 0x060000
> [ 2.271081] pci 0000:00:18.3: [1022:1603] type 00 class 0x060000
> [ 2.272083] pci 0000:00:18.4: [1022:1604] type 00 class 0x060000
> [ 2.273079] pci 0000:00:18.5: [1022:1605] type 00 class 0x060000
> [ 2.274083] pci 0000:00:19.0: [1022:1600] type 00 class 0x060000
> [ 2.275089] pci 0000:00:19.1: [1022:1601] type 00 class 0x060000
> [ hard hang ]
>
> I have bisected the hang to:
>
> fa564ad96366: x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)
>
> Reverting the commit makes the system boot again. The 'new resource' line above is,
> I believe, the new BAR added by the commit.
>
> I've attached the earlyprintk boot log of the hang, with a few printks added to
> pci_amd_enable_64bit_bar() of the relevant fields:
>
> + printk("res->start: %016llx\n", res->start);
> + printk("res->end: %016llx\n", res->end);
> + printk("base: %08x\n", base);
> + printk("high: %08x\n", high);
> + printk("limit: %08x\n", limit);
> + printk("slot: %d\n", i);
>
> [ 2.261090] pci 0000:00:18.1: [1022:1601] type 00 class 0x060000
> [ 2.262005] pci 0000:00:18.1: adding root bus resource [mem 0x1027000000-0xfcffffffff 64bit pref window]
> [ 2.264001] res->start: 0000001027000000
> [ 2.265001] res->end: 000000fcffffffff
> [ 2.266001] base: 10270003
> [ 2.267001] high: 00000000
> [ 2.268001] limit: fd000000
> [ 2.269001] slot: 1
> [ 2.270081] pci 0000:00:18.2: [1022:1602] type 00 class 0x060000
> [ 2.271081] pci 0000:00:18.3: [1022:1603] type 00 class 0x060000
> [ 2.272083] pci 0000:00:18.4: [1022:1604] type 00 class 0x060000
> [ 2.273079] pci 0000:00:18.5: [1022:1605] type 00 class 0x060000
> [ 2.274083] pci 0000:00:19.0: [1022:1600] type 00 class 0x060000
> [ 2.275089] pci 0000:00:19.1: [1022:1601] type 00 class 0x060000
>
> On a successful bootup the system would continue with:
>
> [ 0.583060] pci 0000:00:19.2: [1022:1602] type 00 class 0x060000
> [ 0.584079] pci 0000:00:19.3: [1022:1603] type 00 class 0x060000
> [ 0.585084] pci 0000:00:19.4: [1022:1604] type 00 class 0x060000
> [ 0.586079] pci 0000:00:19.5: [1022:1605] type 00 class 0x060000
> [ 0.588039] pci 0000:00:1a.0: [1022:1600] type 00 class 0x060000
> [ 0.589090] pci 0000:00:1a.1: [1022:1601] type 00 class 0x060000
> [ 0.590079] pci 0000:00:1a.2: [1022:1602] type 00 class 0x060000
> [ 0.591080] pci 0000:00:1a.3: [1022:1603] type 00 class 0x060000
> [ 0.593006] pci 0000:00:1a.4: [1022:1604] type 00 class 0x060000
> [ 0.594079] pci 0000:00:1a.5: [1022:1605] type 00 class 0x060000
> [ 0.595082] pci 0000:00:1b.0: [1022:1600] type 00 class 0x060000
> [ 0.596087] pci 0000:00:1b.1: [1022:1601] type 00 class 0x060000
> [ 0.597083] pci 0000:00:1b.2: [1022:1602] type 00 class 0x060000
> [ 0.598080] pci 0000:00:1b.3: [1022:1603] type 00 class 0x060000
> [ 0.599085] pci 0000:00:1b.4: [1022:1604] type 00 class 0x060000
> [ 0.600079] pci 0000:00:1b.5: [1022:1605] type 00 class 0x060000
> [ 0.601124] pci 0000:03:00.0: [1000:0072] type 00 class 0x010700
> [ 0.602037] pci 0000:03:00.0: reg 0x10: [io 0xe000-0xe0ff]
> [ 0.603010] pci 0000:03:00.0: reg 0x14: [mem 0xdff3c000-0xdff3ffff 64bit]
> [ 0.604009] pci 0000:03:00.0: reg 0x1c: [mem 0xdff40000-0xdff7ffff 64bit]
> [ 0.605011] pci 0000:03:00.0: reg 0x30: [mem 0xdff80000-0xdfffffff pref]
> ...
>
> cpuinfo:
>
> processor : 31
> vendor_id : AuthenticAMD
> cpu family : 21
> model : 1
> model name : AMD Opteron(tm) Processor 6278
> stepping : 2
> microcode : 0x6000626
> cpu MHz : 1427.124
> cache size : 2048 KB
> physical id : 1
> siblings : 16
> core id : 7
> cpu cores : 8
>
> board:
>
> Manufacturer: Supermicro
> Product Name: H8DG6/H8DGi
>
> BIOS:
>
> Vendor: American Megatrends Inc.
> Version: 2.0b
> Release Date: 03/01/2012
>
> I've attached the lspci -v output and a successful full bootlog as well, with
> various debugging options enabled. Let me know if you need any other info.
>
> Thanks,
>
> Ingo
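
The printk values in the quoted log can be cross-checked with a little arithmetic. The following is a minimal sketch, not the code from the commit: the names encode_window(), window_size() and struct mmio_regs are hypothetical, and the bit layout (address shifted right by 8, the low two bits of "base" used as read/write enables, address bits 40 and up going into "high", with the limit part assumed to sit at bit 16) is an assumption inferred from the logged base/high/limit values rather than taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; layout inferred from the logged values, not from the patch. */
#define MMIO_ADDR_MASK 0xffffff00u /* address bits kept in base/limit */
#define MMIO_BASE_RE   0x1u        /* read enable (assumed) */
#define MMIO_BASE_WE   0x2u        /* write enable (assumed) */

struct mmio_regs {
	uint32_t base;
	uint32_t high;
	uint32_t limit;
};

/* Encode an inclusive [start, end] window into base/high/limit values. */
static struct mmio_regs encode_window(uint64_t start, uint64_t end)
{
	struct mmio_regs r;
	uint64_t top = end + 1; /* first address past the window */

	assert(end >= start);

	r.base = ((uint32_t)(start >> 8) & MMIO_ADDR_MASK) |
		 MMIO_BASE_RE | MMIO_BASE_WE;
	r.limit = (uint32_t)(top >> 8) & MMIO_ADDR_MASK;
	/* Bits 40..47 of start and of top; the shift of 16 is an assumption. */
	r.high = (uint32_t)((start >> 40) & 0xff) |
		 ((uint32_t)((top >> 40) & 0xff) << 16);
	return r;
}

/* Size in bytes of an inclusive [start, end] window. */
static uint64_t window_size(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start + 1;
}
```

With the logged range, encode_window(0x1027000000, 0xfcffffffff) gives base 0x10270003, high 0x00000000 and limit 0xfd000000, which matches the printk output. window_size() for the same range is 0xecd9000000 bytes, well above the 256GB named in the title of the third attached patch.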
View attachment "0001-x86-PCI-fix-infinity-loop-in-search-for-64bit-BAR-pl.patch" of type "text/x-patch" (1225 bytes)
View attachment "0002-x86-PCI-only-enable-a-64bit-BAR-on-single-socket-AMD.patch" of type "text/x-patch" (2369 bytes)
View attachment "0003-x86-PCI-limit-the-size-of-the-64bit-BAR-to-256GB.patch" of type "text/x-patch" (1161 bytes)