Message-ID: <47e950989fe0fc297a2272139d69a5a7c4c98de5.camel@uvos.xyz>
Date:   Wed, 01 Nov 2023 13:35:47 +0100
From:   Carl Klemm <carl@...s.xyz>
To:     christian.koenig@....com, alexander.deucher@....com
Cc:     dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: [BUG] gpu: drm: amd: noretry=0 causes failure in
 amdgpu_device_ip_resume on vega10

Hi,

When migrating from 5.15 to 6.5.9 I noticed that noretry=0 no longer
works on vega10 (Instinct MI25). The device fails to start:

[   40.080411] amdgpu: fw load failed
[   40.083816] amdgpu: smu firmware loading failed
[   40.088350] amdgpu 0000:83:00.0: amdgpu: amdgpu_device_ip_resume failed (-22).
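
For anyone checking their own logs for the same symptom, here is a minimal
Python sketch that scans dmesg text for amdgpu "failed (N)" lines and maps
the code to its errno name. The helper name find_amdgpu_failures and the
regular expression are illustrative assumptions, not part of the driver or
any existing tool; the pattern is modelled only on the log lines quoted above.

```python
import errno
import re

# Modelled on lines such as:
#   amdgpu 0000:83:00.0: amdgpu: amdgpu_device_ip_resume failed (-22).
# \s+ also matches a newline, so a log line wrapped by a mail client still matches.
FAILED_RE = re.compile(r"amdgpu: (?P<func>\w+)\s+failed \((?P<code>-?\d+)\)")


def find_amdgpu_failures(dmesg_text):
    """Return a list of (function, code, errno_name) for each failure found.

    Hypothetical helper for illustration only.
    """
    results = []
    for match in FAILED_RE.finditer(dmesg_text):
        code = int(match.group("code"))
        # Kernel functions return negative errno values, e.g. -22 for EINVAL.
        name = errno.errorcode.get(abs(code), "unknown")
        results.append((match.group("func"), code, name))
    return results


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 this_script.py
    for func, code, name in find_amdgpu_failures(sys.stdin.read()):
        print(f"{func} failed with {code} ({name})")
```

With the excerpt above, the -22 returned by amdgpu_device_ip_resume corresponds
to EINVAL.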

I have also reproduced the same issue on 6.1.55.
It is also possible that the issue was caused by newer GPU firmware
rather than by the change in kernel. The issue was reproduced with the
firmware from linux-firmware-20230804.

For the full dmesg, see: https://uvos.xyz/noretry.dmesg.log

The same system also contains 2 vega20 devices and 1 navi21 device, all
of which work fine with noretry=0. For more information on the system
this problem was encountered on, see this rocminfo dump:
https://uvos.xyz/rocminfo.log

Regards,
Carl
