Date:   Wed, 6 Dec 2017 17:16:49 +0100
From:   Ingo Molnar <mingo@...nel.org>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        Christian König <christian.koenig@....com>,
        Andy Shevchenko <andy.shevchenko@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: [regression] PCI early boot hang on certain AMD systems (was: Re:
 [GIT PULL] PCI changes for v4.15)


Hi,

* Bjorn Helgaas <helgaas@...nel.org> wrote:

> PCI changes:

> Christian König (4):
>       x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)

In v4.15 one of my test systems broke. It hangs during early bootup, in early PCI 
setup:

[    2.262005] pci 0000:00:18.1: adding root bus resource [mem 0x1027000000-0xfcffffffff 64bit pref window] <--- new resource
[    2.270081] pci 0000:00:18.2: [1022:1602] type 00 class 0x060000
[    2.271081] pci 0000:00:18.3: [1022:1603] type 00 class 0x060000
[    2.272083] pci 0000:00:18.4: [1022:1604] type 00 class 0x060000
[    2.273079] pci 0000:00:18.5: [1022:1605] type 00 class 0x060000
[    2.274083] pci 0000:00:19.0: [1022:1600] type 00 class 0x060000
[    2.275089] pci 0000:00:19.1: [1022:1601] type 00 class 0x060000
[  hard hang ]

I have bisected the hang to:

  fa564ad96366: x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)

Reverting the commit makes the system boot again. The 'new resource' line above is, 
I believe, the new BAR added by the commit.

I've attached the earlyprintk boot log of the hang, with a few printks added to 
pci_amd_enable_64bit_bar() to show the relevant fields:

+       printk("res->start: %016llx\n", res->start);
+       printk("res->end:   %016llx\n", res->end);
+       printk("base:       %08x\n", base);
+       printk("high:       %08x\n", high);
+       printk("limit:      %08x\n", limit);
+       printk("slot:       %d\n", i);

[    2.261090] pci 0000:00:18.1: [1022:1601] type 00 class 0x060000
[    2.262005] pci 0000:00:18.1: adding root bus resource [mem 0x1027000000-0xfcffffffff 64bit pref window]
[    2.264001] res->start: 0000001027000000
[    2.265001] res->end:   000000fcffffffff
[    2.266001] base:       10270003
[    2.267001] high:       00000000
[    2.268001] limit:      fd000000
[    2.269001] slot:       1
[    2.270081] pci 0000:00:18.2: [1022:1602] type 00 class 0x060000
[    2.271081] pci 0000:00:18.3: [1022:1603] type 00 class 0x060000
[    2.272083] pci 0000:00:18.4: [1022:1604] type 00 class 0x060000
[    2.273079] pci 0000:00:18.5: [1022:1605] type 00 class 0x060000
[    2.274083] pci 0000:00:19.0: [1022:1600] type 00 class 0x060000
[    2.275089] pci 0000:00:19.1: [1022:1601] type 00 class 0x060000
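
As a cross-check of the printed fields, here is a minimal sketch of an encoding that reproduces them from res->start and res->end. The mask names and the layout (address shifted right by 8, low two bits used as read/write enables) are assumptions chosen to match the values above. They are not taken from the commit itself.

```c
#include <stdint.h>

/* Hypothetical names; layout assumed, picked to reproduce the log above. */
#define MMIO_ADDR_MASK 0xffffff00u /* address bits 39:16 kept in bits 31:8 */
#define MMIO_READ_EN   0x1u        /* assumed read-enable bit */
#define MMIO_WRITE_EN  0x2u        /* assumed write-enable bit */

/* Encode the window start into the 32-bit "base" value. */
static uint32_t encode_base(uint64_t start)
{
	return (uint32_t)((start >> 8) & MMIO_ADDR_MASK) |
	       MMIO_READ_EN | MMIO_WRITE_EN;
}

/* Encode the window end (inclusive) into the 32-bit "limit" value. */
static uint32_t encode_limit(uint64_t end)
{
	return (uint32_t)(((end + 1) >> 8) & MMIO_ADDR_MASK);
}

/* Address bits 47:40, i.e. whatever would have to go into "high". */
static uint32_t upper_bits(uint64_t addr)
{
	return (uint32_t)((addr >> 40) & 0xff);
}
```

Under these assumptions, encode_base(0x1027000000) gives 0x10270003 and encode_limit(0xfcffffffff) gives 0xfd000000, matching the printed base and limit. upper_bits() is 0 for both the start and end + 1, which is consistent with high being 00000000 regardless of how that register is laid out.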

On a successful bootup the system would continue with:

[    0.583060] pci 0000:00:19.2: [1022:1602] type 00 class 0x060000
[    0.584079] pci 0000:00:19.3: [1022:1603] type 00 class 0x060000
[    0.585084] pci 0000:00:19.4: [1022:1604] type 00 class 0x060000
[    0.586079] pci 0000:00:19.5: [1022:1605] type 00 class 0x060000
[    0.588039] pci 0000:00:1a.0: [1022:1600] type 00 class 0x060000
[    0.589090] pci 0000:00:1a.1: [1022:1601] type 00 class 0x060000
[    0.590079] pci 0000:00:1a.2: [1022:1602] type 00 class 0x060000
[    0.591080] pci 0000:00:1a.3: [1022:1603] type 00 class 0x060000
[    0.593006] pci 0000:00:1a.4: [1022:1604] type 00 class 0x060000
[    0.594079] pci 0000:00:1a.5: [1022:1605] type 00 class 0x060000
[    0.595082] pci 0000:00:1b.0: [1022:1600] type 00 class 0x060000
[    0.596087] pci 0000:00:1b.1: [1022:1601] type 00 class 0x060000
[    0.597083] pci 0000:00:1b.2: [1022:1602] type 00 class 0x060000
[    0.598080] pci 0000:00:1b.3: [1022:1603] type 00 class 0x060000
[    0.599085] pci 0000:00:1b.4: [1022:1604] type 00 class 0x060000
[    0.600079] pci 0000:00:1b.5: [1022:1605] type 00 class 0x060000
[    0.601124] pci 0000:03:00.0: [1000:0072] type 00 class 0x010700
[    0.602037] pci 0000:03:00.0: reg 0x10: [io  0xe000-0xe0ff]
[    0.603010] pci 0000:03:00.0: reg 0x14: [mem 0xdff3c000-0xdff3ffff 64bit]
[    0.604009] pci 0000:03:00.0: reg 0x1c: [mem 0xdff40000-0xdff7ffff 64bit]
[    0.605011] pci 0000:03:00.0: reg 0x30: [mem 0xdff80000-0xdfffffff pref]
...

cpuinfo:

 processor       : 31
 vendor_id       : AuthenticAMD
 cpu family      : 21
 model           : 1
 model name      : AMD Opteron(tm) Processor 6278
 stepping        : 2
 microcode       : 0x6000626
 cpu MHz         : 1427.124
 cache size      : 2048 KB
 physical id     : 1
 siblings        : 16
 core id         : 7
 cpu cores       : 8
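
For reference, "cpu family 21" is 0x15 in hex, so this CPU falls in the "Family 15h (Models 00-1f ...)" range named in the commit title. The sketch below only mirrors the ranges in that title. The helper name is hypothetical, and how the actual quirk matches hardware is not shown in this message.

```c
#include <stdbool.h>

/* Hypothetical helper: does (family, model) fall in the ranges
 * listed in the commit title (Family 15h, Models 00-1f, 30-3f, 60-7f)? */
static bool matches_commit_title(unsigned family, unsigned model)
{
	if (family != 0x15)
		return false;
	return model <= 0x1f ||
	       (model >= 0x30 && model <= 0x3f) ||
	       (model >= 0x60 && model <= 0x7f);
}
```

With the cpuinfo values above, matches_commit_title(21, 1) is true.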

board:

        Manufacturer: Supermicro
        Product Name: H8DG6/H8DGi

BIOS:

        Vendor: American Megatrends Inc.
        Version: 2.0b      
        Release Date: 03/01/2012

I've attached the lspci -v output and a successful full bootlog as well, with 
various debugging options enabled. Let me know if you need any other info.

Thanks,

	Ingo

View attachment "pci-hang.log" of type "text/plain" (24674 bytes)

View attachment "lspci.txt" of type "text/plain" (10372 bytes)

View attachment "boot.log" of type "text/plain" (65610 bytes)
