linux-kernel - Re: [RFC PATCH] drm/amdgpu: Avoid unnecessary Call Traces in amdgpu_irq


Message-ID: <1a0dab04-cb13-9307-2853-38221193e38e@loongson.cn>
Date: Mon, 19 Jan 2026 16:53:30 +0800
From: Tiezhu Yang <yangtiezhu@...ngson.cn>
To: Christian König <christian.koenig@....com>,
 Alex Deucher <alexander.deucher@....com>
Cc: Alan Liu <haoping.liu@....com>, amd-gfx@...ts.freedesktop.org,
 linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] drm/amdgpu: Avoid unnecessary Call Traces in
 amdgpu_irq_put()

On 2026/1/16 6:03 PM, Christian König wrote:
> On 1/16/26 02:20, Tiezhu Yang wrote:
>> On 2026/1/15 9:47 PM, Christian König wrote:
>>> On 1/15/26 02:28, Tiezhu Yang wrote:
>>>> Currently, there are many Call Traces when booting kernel on LoongArch,
>>>> here are the trimmed kernel log messages:
>>>>
>>>>     amdgpu 0000:07:00.0: amdgpu: hw_init of IP block <gfx_v6_0> failed -110
>>>>     amdgpu 0000:07:00.0: amdgpu: amdgpu_device_ip_init failed
>>>>     amdgpu 0000:07:00.0: amdgpu: Fatal error during GPU init
>>>>     amdgpu 0000:07:00.0: amdgpu: amdgpu: finishing device.
>>>>     ------------[ cut here ]------------
>>>>     WARNING: drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:639 at amdgpu_irq_put+0xb0/0x140 [amdgpu], CPU#0: kworker/0:0/9
>>>>     ...
>>>>     Call Trace:

...

> The warning can basically only be triggered by two conditions:
> 1. A fatal problem while loading the driver and the error handling is not 100% clean.
> 2. A driver coding error.
> 
> And we really need to catch all of those, so there is no real rationale to limit the warning.
> 
> I mean when you run into any of those they should potentially be fixed at some point.

I made the following change and it fixes the problem. Since I am not
familiar with the amdgpu driver, could you please check it? If it is OK,
I will send a formal patch later.

----->8-----
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index 8112ffc85995..ac19565e7c45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -138,6 +138,9 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev)
                         if (!src || !src->funcs->set || !src->num_types)
                                 continue;

+                       kfree(src->enabled_types);
+                       src->enabled_types = NULL;
+
                         for (k = 0; k < src->num_types; ++k) {
                                 r = src->funcs->set(adev, src, k,
                                                     AMDGPU_IRQ_STATE_DISABLE);

Thanks,
Tiezhu
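
Editorial sketch (not part of the original message): the diff above frees
and clears src->enabled_types before the sources are disabled. A plausible
reading, which is an assumption here and not something the thread confirms,
is that a later put on that source then bails out on a NULL check before
reaching the warning. The user-space model below only illustrates that
pattern. Every name in it (irq_src_model, model_irq_put, warn_count, and so
on) is hypothetical and is not the amdgpu API.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical model of one interrupt source; NOT struct amdgpu_irq_src. */
struct irq_src_model {
    unsigned int num_types;
    int *enabled_types;  /* per-type reference counts; NULL once released */
};

/* Counts how often the modelled warning fired (stands in for a kernel WARN). */
int warn_count;

int model_src_init(struct irq_src_model *src, unsigned int num_types)
{
    assert(src != NULL);
    src->num_types = num_types;
    src->enabled_types = calloc(num_types, sizeof(*src->enabled_types));
    return src->enabled_types ? 0 : -ENOMEM;
}

int model_irq_get(struct irq_src_model *src, unsigned int type)
{
    assert(src != NULL);
    if (type >= src->num_types || !src->enabled_types)
        return -EINVAL;
    src->enabled_types[type]++;
    return 0;
}

int model_irq_put(struct irq_src_model *src, unsigned int type)
{
    assert(src != NULL);
    if (type >= src->num_types)
        return -EINVAL;
    if (!src->enabled_types)  /* counters already released: quiet exit */
        return -EINVAL;
    if (src->enabled_types[type] == 0) {
        /* Unbalanced put: this is the case that produces the noisy trace. */
        warn_count++;
        fprintf(stderr, "WARNING: put without matching get (type %u)\n", type);
        return -EINVAL;
    }
    src->enabled_types[type]--;
    return 0;
}

/* Models the two lines added by the diff: free the counters, clear the pointer. */
void model_release_counts(struct irq_src_model *src)
{
    assert(src != NULL);
    free(src->enabled_types);
    src->enabled_types = NULL;
}
```

In this model, a put on a source that was never enabled bumps warn_count,
while the same put after model_release_counts() returns -EINVAL without
warning. Note that the unbalanced put still happens in both cases; the
model only shows that the warning is no longer reached, which is the
concern raised in the quoted reply about catching such errors.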