Message-ID: <20220926171104.GA1605932@bhelgaas>
Date:   Mon, 26 Sep 2022 12:11:04 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Richard Rogalski <rrogalski@...anota.com>
Cc:     Linux Pci <linux-pci@...r.kernel.org>,
        Alex Deucher <alexander.deucher@....com>,
        "David S. Miller" <davem@...emloft.net>,
        sparclinux@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: SPARC64: getting "no compatible bridge window" errors :/

[+cc Alex, David, sparclinux, LKML]

On Sun, Sep 25, 2022 at 06:59:23PM +0200, Richard Rogalski wrote:
> I hope this is the right place for this.

This is great, thanks a lot for your report!  Is this a regression?
If so, what's the most recent kernel that worked?  

> In my dmesg output, I get things like:
> 
> pci 0000:04:00.0: can't claim VGA legacy [mem 0x000a0000-0x000bffff]: no compatible bridge window
> pci 0000:06:00.0: can't claim VGA legacy [mem 0x000a0000-0x000bffff]: no compatible bridge window
> pci 0000:06:00.1: can't claim BAR 0 [mem 0x84110200000-0x84110203fff 64bit]: no compatible bridge window
> 
> I opened a bug for amdgpu [here](https://gitlab.freedesktop.org/drm/amd/-/issues/2169) but looking further into it I think it is caused by deeper PCIe problems :\
> 
> https://gitlab.freedesktop.org/drm/amd/uploads/cbf47807972c8a990bb2a8cdbb39ad9e/8C7CA9QNG dmesg log
> https://gitlab.freedesktop.org/drm/amd/uploads/6a799425dea50febd82f8bc11e54433a/ll.txt lspci -vv
> https://gitlab.freedesktop.org/drm/amd/uploads/7d4a794b1f7d67a1ffcdee5dfdec3ad6/config.txt kernel .config

Your error output attachment [1] contains an address that looks like
it's in 06:00.0 BAR 5:

  pci 0000:06:00.0: reg 0x24: [mem 0x84001200000-0x8400123ffff]
  NON-RESUMABLE ERROR: insn effective address [0x0000084001201410]
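
As a quick sanity check of that reading, here is a minimal sketch in Python. It only redoes the arithmetic on the two lines above; the helper names are hypothetical, and the one outside fact it relies on is that BAR 0 sits at config-space offset 0x10 with 4 bytes per BAR slot.

```python
# Values copied from the two log lines above; helper names are hypothetical.
PCI_BASE_ADDRESS_0 = 0x10  # config-space offset of BAR 0 (standard type 0 header)

def bar_index(reg):
    """Map a config-space register offset to a BAR number (4 bytes per BAR slot)."""
    return (reg - PCI_BASE_ADDRESS_0) // 4

def in_region(addr, start, end):
    """True if addr lies in the inclusive range [start, end]."""
    return start <= addr <= end

bar_start, bar_end = 0x84001200000, 0x8400123ffff   # 06:00.0 reg 0x24
fault_addr = 0x0000084001201410                      # insn effective address

print(bar_index(0x24))                               # 5
print(in_region(fault_addr, bar_start, bar_end))     # True
print(hex(fault_addr - bar_start))                   # 0x1410
```

So the faulting access lands 0x1410 bytes into the region that reg 0x24 describes, which is why it looks like a BAR 5 access.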

This looks like an amdgpu issue.  There have been recent changes like
c1c39032a074 ("drm/amdgpu: make sure to init common IP before gmc")
and dd6aeb4e5f59 ("drm/amdgpu: Don't enable LTR if not supported")
that could be related.

The PCI "no compatible bridge window" warnings are definitely an
issue, but I don't think they're related to the amdgpu crash:

  pci@400: PCI MEM64 [mem 0x84100000000-0x84dffffffff] offset 80000000000
  pci_bus 0000:00: root bus resource [mem 0x84100000000-0x84dffffffff] (bus address [0x4100000000-0x4dffffffff])
  pci 0000:09:00.0: can't claim BAR 0 [mem 0x84120000000-0x8412007ffff 64bit]: no compatible bridge window

Those messages, together with this from lspci:

  0000:01:00.0 bridge to [bus 02-09] window [mem 0x4100000000-0x412fffffff pref]
  0000:02:0c.0 bridge to [bus 09]    window [mem 0x4120000000-0x412fffffff pref]
  0000:09:00.0 Intel 82599ES NIC Region 0: Memory at 0x84120000000

are telling us there's something wrong with how the resource-to-bus
offset is being applied.  It looks like the offset was applied to the
NIC BAR, but didn't get applied to the bridge windows.
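
To make the mismatch concrete, here is a rough sketch in Python of the arithmetic involved. It is not how the kernel implements the check; the function names are made up for illustration, and the numbers are copied from the dmesg and lspci lines above. It assumes the bridge window values are bus addresses, i.e. that the offset was never added to them.

```python
# Illustrative arithmetic only; helper names are hypothetical.
OFFSET = 0x80000000000  # "offset 80000000000" from the pci@400 MEM64 line

def cpu_to_bus(cpu_addr, offset=OFFSET):
    """CPU (resource) address -> PCI bus address."""
    return cpu_addr - offset

def bus_to_cpu(bus_addr, offset=OFFSET):
    """PCI bus address -> CPU (resource) address."""
    return bus_addr + offset

def contains(window, region):
    """True if the inclusive range 'region' fits inside 'window'."""
    return window[0] <= region[0] and region[1] <= window[1]

root_cpu   = (0x84100000000, 0x84dffffffff)  # root bus resource
bar0_cpu   = (0x84120000000, 0x8412007ffff)  # 09:00.0 BAR 0 (offset applied)
window_raw = (0x4120000000, 0x412fffffff)    # 02:0c.0 window (assumed: no offset)

# The root bus line is self-consistent: resource minus offset is the bus range.
print([hex(cpu_to_bus(a)) for a in root_cpu])

# Comparing the BAR against the raw window fails, as in the warning.
print(contains(window_raw, bar0_cpu))        # False

# With the offset applied to the window as well, the BAR fits.
window_cpu = tuple(bus_to_cpu(a) for a in window_raw)
print(contains(window_cpu, bar0_cpu))        # True
```

The containment test only succeeds when both sides are in the same address space, which is consistent with the offset having been applied to the BAR but not to the bridge windows.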

Could you start a new thread here (linux-kernel@...r.kernel.org,
linux-pci@...r.kernel.org, and sparclinux@...r.kernel.org) for this
issue and attach the dmesg log when booting with "ofpci_debug=1"?

Do the devices we complain about (NICs and storage HBAs 09:00.0,
09:00.1, 0d:00.0, 0d:00.1, 0e:00.0, 0f:00.0, 0001:03:00.0,
0001:03:00.1, 0001:0a:00.0, 0001:0a:00.1) work?

Bjorn

[1] https://gitlab.freedesktop.org/drm/amd/uploads/b51f4d6783eeebf90de9a400525d07d6/qq
