Message-ID: <o2bL8MtD_40-lf8GlslTw-AZpUPzm8nmfCnJKvS8RQ3NOzOW1uq1dVCEfRpUjJ2i7G2WjfQhk2IWZ7oGp-7G-jXN4qOdtnyOcjRR0PZWK5I=@r26.me>
Date: Mon, 09 Jun 2025 02:34:02 +0000
From: rio@....me
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>, "regressions@...ts.linux.dev" <regressions@...ts.linux.dev>, "amd-gfx@...ts.freedesktop.org" <amd-gfx@...ts.freedesktop.org>
Subject: [REGRESSION] amdgpu fails to load external RX 580 since PCI: Allow relaxed bridge window tail sizing for optional resources

Hello,

I have an external Radeon RX 580 connected to my machine via Thunderbolt, and
since upgrading from 6.14.1 the setup has stopped working. dmesg shows a warning
from the resource sanity check, followed by a stack trace:
https://pastebin.com/njR55rQW
Relevant snippet:

[   12.134907] amdgpu 0000:06:00.0: BAR 2 [mem 0x6000000000-0x60001fffff 64bit pref]: releasing
[   12.134910] [drm:amdgpu_device_resize_fb_bar [amdgpu]] *ERROR* Problem resizing BAR0 (-16).
[   12.135456] amdgpu 0000:06:00.0: BAR 2 [mem 0x6000000000-0x60001fffff 64bit pref]: assigned
[   12.135524] amdgpu 0000:06:00.0: amdgpu: VRAM: 8192M 0x000000F400000000 - 0x000000F5FFFFFFFF (8192M used)
[   12.135527] amdgpu 0000:06:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[   12.135536] resource: resource sanity check: requesting [mem 0x0000000000000000-0xffffffffffffffff], which spans more than PCI Bus 0000:00 [mem 0x000a0000-0x000bffff window]
[   12.135542] ------------[ cut here ]------------
[   12.135543] WARNING: CPU: 6 PID: 599 at arch/x86/mm/pat/memtype.c:721 memtype_reserve_io+0xfc/0x110
[   12.135551] Modules linked in: ccm amdgpu(+) snd_hda_codec_realtek ...
[   12.135652] CPU: 6 UID: 0 PID: 599 Comm: (udev-worker) Tainted: G S                  6.15.0-13743-g8630c59e9936 #16 PREEMPT(full)  3b462c924b3ffd8156fc3b77bcc8ddbf7257fa57
[   12.135654] Tainted: [S]=CPU_OUT_OF_SPEC
[   12.135655] Hardware name: COPELION INTERNATIONAL INC. ZX Series/ZX Series, BIOS 1.07.08TCOP3 03/27/2020
[   12.135656] RIP: 0010:memtype_reserve_io+0xfc/0x110
[   12.135659] Code: aa fb ff ff b8 f0 ff ff ff eb 88 8b 54 24 04 4c 89 ee 48 89 df e8 04 fe ff ff 85 c0 75 db 8b 54 24 04 41 89 16 e9 69 ff ff ff <0f> 0b e9 4b ff ff ff e8 b8 5c fc 00 0f 1f 84 00 00 00 00 00 90 90

Bisecting the stable branch pointed me to the following commit:

commit 22df32c984be9e9145978acf011642da042a2af3 (HEAD)
Author: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Date:   Mon Dec 16 19:56:11 2024 +0200

    PCI: Allow relaxed bridge window tail sizing for optional resources
    
    [ Upstream commit 67f9085596ee55dd27b540ca6088ba0717ee511c ]

I've tested on stable (as of now 8630c59e99363c4b655788fd01134aef9bcd9264), and
the issue persists. Reverting the offending commit via `git revert -n
22df32c984be9e9145978acf011642da042a2af3` allowed amdgpu to load again.
Dmesg: https://pastebin.com/xd76rDsW.

Additional information:
   - Distribution: Artix
   - Arch: x86_64
   - Kernel config: https://pastebin.com/DWSERJL5
   - eGPU adapter: https://www.adt.link/product/R43SG-TB3.html
   - Booting with pci=realloc,hpbussize=0x33,hpmmiosize=256M,hpmmioprefsize=1G

I'm reporting here as these are the contacts from the commit message. Please let
me know if there's a more appropriate place for this, or if there is any more
information I can provide.

Thanks,
Rio

