Message-ID: <CABXGCsPLAs+rCktbM_ao3bP3VZuaLqXSMpXZt1m-B9nqf91EQw@mail.gmail.com>
Date: Tue, 20 May 2025 14:33:15 +0500
From: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
To: aurabindo.pillai@....com, chiahsuan.chung@....com, ray.wu@....com,
"Wheeler, Daniel" <daniel.wheeler@....com>, "Deucher, Alexander" <alexander.deucher@....com>,
amd-gfx list <amd-gfx@...ts.freedesktop.org>, dri-devel <dri-devel@...ts.freedesktop.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Linux regressions mailing list <regressions@...ts.linux.dev>
Subject: 6.15-rc6/regression/bisected - after commit f1c6be3999d2 error
appeared: *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error
Hi,
After commit f1c6be3999d2, the following error appears:
[ 1421.701677] amdgpu 0000:03:00.0: [drm] *ERROR*
dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic
data
[ 1421.896810] amdgpu 0000:03:00.0: [drm] *ERROR*
dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic
data
[ 1422.088397] amdgpu 0000:03:00.0: [drm] *ERROR*
dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic
data
[ 1426.448674] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with
your previous command: SMN_C2PMSG_66:0x00000012
SMN_C2PMSG_82:0x00000005
[ 1426.448804] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!
[ 1430.149443] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with
your previous command: SMN_C2PMSG_66:0x00000012
SMN_C2PMSG_82:0x00000005
[ 1430.149456] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!
[ 1433.846389] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with
your previous command: SMN_C2PMSG_66:0x00000012
SMN_C2PMSG_82:0x00000005
[ 1433.846400] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!
[ 1437.543718] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with
your previous command: SMN_C2PMSG_66:0x00000012
SMN_C2PMSG_82:0x00000005
[ 1437.543727] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!
[ 1439.966738] watchdog: CPU28: Watchdog detected hard LOCKUP on cpu 28
[ 1439.966742] Modules linked in: uinput rfcomm snd_seq_dummy
snd_hrtimer nft_queue nfnetlink_queue nf_conntrack_netbios_ns
nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib
nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct
nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set
nf_tables qrtr bnep sunrpc binfmt_misc amd_atl intel_rapl_msr
intel_rapl_common mt7921e mt7921_common mt792x_lib mt76_connac_lib
edac_mce_amd mt76 btusb btrtl btintel snd_hda_codec_realtek btbcm
btmtk snd_hda_codec_generic snd_hda_scodec_component kvm_amd
snd_hda_codec_hdmi mac80211 bluetooth vfat snd_hda_intel fat
snd_intel_dspcfg kvm snd_intel_sdw_acpi snd_hda_codec snd_hda_core
spd5118 snd_hwdep libarc4 snd_seq irqbypass snd_seq_device wmi_bmof
cfg80211 r8169 rapl joydev snd_pcm snd_timer i2c_piix4 pcspkr k10temp
i2c_smbus snd realtek rfkill soundcore gpio_amdpt gpio_generic loop
nfnetlink zram lz4hc_compress lz4_compress amdgpu amdxcp i2c_algo_bit
drm_ttm_helper ttm drm_exec polyval_clmulni
[ 1439.966788] gpu_sched nvme polyval_generic ghash_clmulni_intel
drm_suballoc_helper drm_panel_backlight_quirks ucsi_ccg sha512_ssse3
nvme_core drm_buddy typec_ucsi sha256_ssse3 drm_display_helper
nvme_keyring typec sha1_ssse3 nvme_auth sp5100_tco cec video wmi fuse
[ 1439.966799] irq event stamp: 235192
[ 1439.966800] hardirqs last enabled at (235191):
[<ffffffffa60012a6>] asm_exc_page_fault+0x26/0x30
[ 1439.966805] hardirqs last disabled at (235192):
[<ffffffffa9ba5277>] irqentry_enter+0x57/0x60
[ 1439.966808] softirqs last enabled at (234272):
[<ffffffffa660ee39>] handle_softirqs+0x579/0x840
[ 1439.966810] softirqs last disabled at (234263):
[<ffffffffa660f236>] __irq_exit_rcu+0x126/0x240
[ 1439.966813] CPU: 28 UID: 1000 PID: 209499 Comm: cc1 Tainted: G
W L 6.15.0-rc5-01-3ce9925823c7d6bb0e6eb951bf2db0e9e182582d+
#1 PREEMPT(lazy)
[ 1439.966817] Tainted: [W]=WARN, [L]=SOFTLOCKUP
[ 1439.966818] Hardware name: ASRock B650I Lightning WiFi/B650I
Lightning WiFi, BIOS 3.08 09/18/2024
[ 1439.966819] RIP: 0010:delay_halt_mwaitx+0x20/0x50
The system then hangs after the SOFTLOCKUP.
Bisecting points to commit f1c6be3999d2:
Author: Aurabindo Pillai <aurabindo.pillai@....com>
Date: Wed Apr 16 11:26:54 2025 -0400
drm/amd/display: more liberal vmin/vmax update for freesync
[Why]
FAMS2 expects vmin/vmax to be updated in the case when freesync is
off, but supported. But we only update it when freesync is enabled.
[How]
Change the vsync handler such that dc_stream_adjust_vmin_vmax() its called
irrespective of whether freesync is enabled. If freesync is supported,
then there is no harm in updating vmin/vmax registers.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3546
Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@....com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@....com>
Signed-off-by: Ray Wu <ray.wu@....com>
Tested-by: Daniel Wheeler <daniel.wheeler@....com>
Signed-off-by: Alex Deucher <alexander.deucher@....com>
(cherry picked from commit cfb2d41831ee5647a4ae0ea7c24971a92d5dfa0d)
Cc: stable@...r.kernel.org
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
I also tested a revert of commit f1c6be3999d2 and can confirm that
without it the issue is gone.
My machine spec: https://linux-hardware.org/?probe=4635c5fcb1
My build config, bisect log and full kernel log are attached below.
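For anyone checking their own kernel log for the same symptoms, here is a minimal sketch that counts the messages quoted above. It is not part of the attachments. The function name `count_symptoms` and the pattern labels are placeholders chosen for illustration; only the match strings are taken from the log excerpt in this report.

```python
import re
from collections import Counter

# Match strings copied from the log excerpt in this report.
# The dictionary keys are hypothetical labels, not kernel identifiers.
PATTERNS = {
    "dmcub_error": re.compile(r"dc_dmub_srv_log_diagnostic_data: DMCUB error"),
    "smu_busy": re.compile(r"SMU: I'm not done with"),
    "hard_lockup": re.compile(r"Watchdog detected hard LOCKUP"),
}


def count_symptoms(log_text: str) -> Counter:
    """Count how often each symptom message occurs in a dmesg dump."""
    # Collapse all whitespace so that messages wrapped across
    # several lines (as in an email) still match.
    flat = re.sub(r"\s+", " ", log_text)
    return Counter({name: len(p.findall(flat)) for name, p in PATTERNS.items()})
```

It can be run against a saved log, for example `count_symptoms(open("dmesg.txt").read())`, once on a kernel with the commit and once with it reverted, to compare the counts.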
Aurabindo, could you please take a look as soon as possible?
--
Best Regards,
Mike Gavrilov.
Download attachment ".config.zip" of type "application/zip" (69220 bytes)
Download attachment "bisect-log-drm-ERROR-dc_dmub_srv_log_diagnostic_data-DMCUB-error.zip" of type "application/zip" (1136 bytes)
Download attachment "dmesg.zip" of type "application/zip" (819465 bytes)