Message-ID: <20230119170003.GA316230@bhelgaas>
Date:   Thu, 19 Jan 2023 11:00:03 -0600
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Zeno Davatz <zdavatz@...il.com>
Cc:     linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        Bruno Moreira-Guedes <brunodout.dev@...il.com>,
        Krzysztof Wilczyński <kw@...ux.com>,
        Bjorn Helgaas <bjorn@...gaas.com>
Subject: Re: [Bug 216859] New: PCI bridge to bus boot hang at enumeration

[+cc bjorn@...gaas.com to avoid spamassassin]

On Wed, Jan 18, 2023 at 06:04:58PM -0600, Bjorn Helgaas wrote:
> On Fri, Jan 06, 2023 at 05:42:33PM +0100, Zeno Davatz wrote:
> > On Fri, Dec 30, 2022 at 7:50 PM Bjorn Helgaas <helgaas@...nel.org> wrote:
> > > On Wed, Dec 28, 2022 at 12:42:34PM -0600, Bjorn Helgaas wrote:
> > > > On Wed, Dec 28, 2022 at 06:42:38PM +0100, Zeno Davatz wrote:
> > > > > On Wed, Dec 28, 2022 at 1:02 PM Bjorn Helgaas <helgaas@...nel.org> wrote:
> > > > > > On Wed, Dec 28, 2022 at 08:37:52AM +0000, bugzilla-daemon@...nel.org wrote:
> > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=216859
> > > > > >
> > > > > > >            Summary: PCI bridge to bus boot hang at enumeration
> > > > > > >     Kernel Version: 6.1-rc1
> > > > > > > ...
> > > > > >
> > > > > > > With Kernel 6.1-rc1 the enumeration process stopped working for me,
> > > > > > > see attachments.
> > > > > > >
> > > > > > > The enumeration works fine with Kernel 6.0 and below.
> > > > > > >
> > > > > > > > The same problem still exists with v6.1 and v6.2-rc1.
> > > > > >
> > > > > > Thank you very much for your report, Zeno!
> > > > > >
> > > > > > v6.0 works, v6.1-rc1 fails.  Would you mind booting v6.1-rc1 with the
> > > > > > "ignore_loglevel initcall_debug" kernel parameters and taking a photo
> > > > > > when it hangs?
> > > > >
> > > > > > I will try this after January 7th, 2023.
> > 
> > I updated the issue:
> > 
> > https://bugzilla.kernel.org/show_bug.cgi?id=216859
> > 
> > I booted with the option: "ignore_loglevel initcall_debug"
> 
> Thanks!  There's so much pcie output in that picture that we can't see
> any of the initcall logging.  Can you capture another movie, but use
> kernel parameters like "ignore_loglevel initcall_debug boot_delay=100"
> to slow things down?  The full-speed boot is too fast for the camera
> to capture all the output.  You can do this on any convenient kernel
> that hangs.

Thanks for the new movie!  The last initcalls I see before the hang
are:

  init_mqueue_fs
  key_proc_init
  jent_mod_init

We must have returned from jent_mod_init() because I think the "saving
config space" messages we see at the hang are from
pcie_portdrv_init().

I built 833477fce7a1 ("Merge tag 'sound-6.1-rc1' of
git://git.kernel.org/pub/scl) with your .config and when I boot it on
qemu, I see this:

  calling  jent_mod_init+0x0/0x32 @ 1
  initcall jent_mod_init+0x0/0x32 returned 0 after 27185 usecs
  calling  af_alg_init+0x0/0x45 @ 1
  NET: Registered PF_ALG protocol family
  ...
  calling  sg_pool_init+0x0/0xb4 @ 1
  initcall sg_pool_init+0x0/0xb4 returned 0 after 462 usecs
  calling  pcie_portdrv_init+0x0/0x43 @ 1
  pcieport 0000:00:1c.0: vgaarb: pci_notify
  pcieport 0000:00:1c.0: runtime IRQ mapping not provided by arch
  pcieport 0000:00:1c.0: enabling bus mastering
  pcieport 0000:00:1c.0: PME: Signaling with IRQ 24
  pcieport 0000:00:1c.0: AER: enabled with IRQ 24
  pcieport 0000:00:1c.0: saving config space at offset 0x0 (reading 0x34208086)
  pcieport 0000:00:1c.0: saving config space at offset 0x4 (reading 0x100507)
  pcieport 0000:00:1c.0: saving config space at offset 0x8 (reading 0x6040002)
  ...

Would you mind trying again with "boot_delay=1000 pcie_ports=compat"?

"boot_delay=1000" should slow it down more (all the action is in the
last 3 seconds and it's still hard to see) and "pcie_ports=compat"
should turn off the PCIe port driver.
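
If the console output can be captured as text (for example over a serial
console or from a qemu run), the check I am doing by eye could be
scripted.  This is only a rough sketch: it assumes the "calling" and
"initcall ... returned" line format shown in the qemu excerpt above, and
the function name and regular expressions are made up for illustration.
It lists initcalls that were entered but never logged a return, which is
where I would expect a hang to be.

```python
import re

# Line shapes assumed from the initcall_debug excerpt in this message.
CALL_RE = re.compile(r"calling\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+ @ \d+")
RET_RE = re.compile(
    r"initcall\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+ returned -?\d+ after \d+ usecs"
)


def unfinished_initcalls(log_text):
    """Return initcalls that logged 'calling' but no matching 'returned'."""
    pending = []
    for line in log_text.splitlines():
        called = CALL_RE.search(line)
        if called:
            pending.append(called.group(1))
            continue
        returned = RET_RE.search(line)
        if returned and returned.group(1) in pending:
            pending.remove(returned.group(1))
    return pending


# Lines copied from the qemu excerpt above (the elided parts are left out).
SAMPLE = """\
calling  jent_mod_init+0x0/0x32 @ 1
initcall jent_mod_init+0x0/0x32 returned 0 after 27185 usecs
calling  sg_pool_init+0x0/0xb4 @ 1
initcall sg_pool_init+0x0/0xb4 returned 0 after 462 usecs
calling  pcie_portdrv_init+0x0/0x43 @ 1
pcieport 0000:00:1c.0: vgaarb: pci_notify
pcieport 0000:00:1c.0: enabling bus mastering
"""

if __name__ == "__main__":
    for name in unfinished_initcalls(SAMPLE):
        print("no return logged for:", name)
```

On those sample lines it reports only pcie_portdrv_init, since the
excerpt ends before that initcall returns.  A log with parts cut out
will of course report every initcall whose return line is missing.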

Bjorn
