[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20251114010619.184918-1-dev@illwieckz.net>
Date: Fri, 14 Nov 2025 02:06:19 +0100
From: Thomas Debesse <dev@...wieckz.net>
To: Alex Deucher <alexander.deucher@....com>,
Christian König <christian.koenig@....com>,
amd-gfx@...ts.freedesktop.org,
linux-kernel@...r.kernel.org
Cc: Timur Kristóf <timur.kristof@...il.com>,
Thomas Debesse <dev@...wieckz.net>
Subject: [PATCH] drm/radeon: no 'auto' default DPM level on GRENADA 0x67B0/1 Radeon R9 390(X)
The "auto" DPM level is known to be faulty on this specific GPU model,
so having it set by default makes the kernel hang quickly after booting.
The bug has been reported on 2015-09-04, and the present patch
writen on 2017-09-10, unfortunately it never made it upstream:
https://bugs.freedesktop.org/show_bug.cgi?id=91880#c167
This bug is making Linux-based systems unusable as the kernel would
experience the hang at startup, sometimes even before the users get a
shell and become able to switch the DPM level to another safe options.
This bug also prevents to install Linux distributions on workstations
featuring this GPU since the kernel from the installation media will
experience the hang before the installation process completes itself.
The two other known safe DPM levels are "low" or "high".
This specific GRENADA model (variant of HAWAII), also known as R9 390
and R9 390X, was an expensive and powerful top-of-the line GPU card for
workstations. It was a flagship of the “Pirate Island” generation, the
one targeted by ROCm at the time, meant for high-performance gaming,
media production and GPGPU computation. Hence it is safe to assume the
users expect the default DPM level for it to be "high" when "auto" is
not available.
The users can still manually switch the DPM level to "low" when they
care about energy saving, thanks to this fix allowing them to complete
the system init and be able to get a shell to change that settings.
The bug was reported as bug #91880 on the freedesktop.org bugzilla:
https://bugs.freedesktop.org/show_bug.cgi?id=91880
for which the discussion continued there as #1222 on mesa/mesa GitLab:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/1222#note_764999
then continued there as #1816 on drm/amd GitLab:
https://gitlab.freedesktop.org/drm/amd/-/issues/1816
Details about the bug also have been discussed there:
https://bugs.freedesktop.org/show_bug.cgi?id=92495
https://bugs.freedesktop.org/show_bug.cgi?id=93288
Without this patch, users have to rely on third-party tools setting
the DPM profile early as part of their init system, like this:
https://github.com/illwieckz/dpm-query
This patch was initially written for both radeon and amdgpu drivers as
as both suffered from the problem at the time. This now targets only
the radeon driver. It looks like the amdgpu driver doesn't suffer from
the hang anymore, so it may be theoretically possible to implement a
better fix for radeon, but none has been done in 10 years, so we better
not wait for another decade. This looks good enough to preven the hang.
This patch does not prevent the users to set the faulty DPM level and
then experience the crash because of his own actions, what this patch
does is to not set this faulty level by default, so the kernel doesn't
experience a hang with its default configuration.
It is also known that plugging a 4K screen may make the bug disappear,
likely by triggering some unexposed performance profie configuration,
but we cannot expect users to own such screen to run the Linux kernel.
The amdgpu driver being made the default one for this hardware is
also a way to workaround the bug. This is finally becoming a thing:
https://lists.freedesktop.org/archives/amd-gfx/2025-November/133615.html
But such switch is not meant to be backported to older kernels with
their less complete amdgpu driver, so this patch better be backported
to any older kernel running the radeon driver by default with such GPU.
The support for such hardware isn't removed in the radeon yet so there
can still be some situations where the radeon driver could be used with
future kernels, so this patch is still relevant for the development
branches and future stable ones.
This patch was first written against Linux 4.12.11 and was rebased
over Linux 6.16 without any conflict, so it should be straightforward
to rebase it over any kernel version from the last decade.
I rebased that patch on 2025-10-13 over Linux 6.16 because that was
the version Timur Kristóf was using when he reproduced the hang on
2025-10-23, so I could test with an environment as close as his:
https://gitlab.freedesktop.org/drm/amd/-/issues/1816#note_3160858
I reproduced the hang on 2025-11-14 after making sure all the
hardware and configuration to reproduce the hang was there:
a Radeon R9 390X GPU, a non-4K screen (1080p), and the radeon driver.
I confirm this patch still works as expected at preventing the hang
to happen:
https://gitlab.freedesktop.org/drm/amd/-/issues/1816#note_3192547
Reported-by: Lauri Gustafsson
Reported-by: John Frei
Reported-by: Thomas Debesse <dev@...wieckz.net>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1816
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1222
Link: https://bugs.freedesktop.org/show_bug.cgi?id=91880
Link: https://bugs.freedesktop.org/show_bug.cgi?id=92495
Link: https://bugs.freedesktop.org/show_bug.cgi?id=93288
Signed-off-by: Thomas Debesse <dev@...wieckz.net>
---
drivers/gpu/drm/radeon/radeon_pm.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/radeon/radeon_pm.c b/drivers/gpu/drm/radeon/radeon_pm.c
index b4fb7e70320b..dc09c9a58b01 100644
--- a/drivers/gpu/drm/radeon/radeon_pm.c
+++ b/drivers/gpu/drm/radeon/radeon_pm.c
@@ -1421,7 +1421,20 @@ static int radeon_pm_init_dpm(struct radeon_device *rdev)
/* default to balanced state */
rdev->pm.dpm.state = POWER_STATE_TYPE_BALANCED;
rdev->pm.dpm.user_state = POWER_STATE_TYPE_BALANCED;
- rdev->pm.dpm.forced_level = RADEON_DPM_FORCED_LEVEL_AUTO;
+
+ switch (rdev->pdev->device) {
+ case 0x67B0:
+ case 0x67B1:
+ /* The "auto" DPM level is known to hang these
+ * high-performance grenada variants.
+ */
+ rdev->pm.dpm.forced_level = RADEON_DPM_FORCED_LEVEL_HIGH;
+ break;
+ default:
+ rdev->pm.dpm.forced_level = RADEON_DPM_FORCED_LEVEL_AUTO;
+ break;
+ }
+
rdev->pm.default_sclk = rdev->clock.default_sclk;
rdev->pm.default_mclk = rdev->clock.default_mclk;
rdev->pm.current_sclk = rdev->clock.default_sclk;
--
2.43.0
Powered by blists - more mailing lists