Message-ID: <20260107105225.37703-1-sunlightlinux@gmail.com>
Date: Wed, 7 Jan 2026 12:52:25 +0200
From: "Ionut Nechita (Sunlight Linux)" <sunlightlinux@...il.com>
To: alexdeucher@...il.com
Cc: alexander.deucher@....com,
amd-gfx@...ts.freedesktop.org,
christian.koenig@....com,
dri-devel@...ts.freedesktop.org,
ionut_n2001@...oo.com,
linux-kernel@...r.kernel.org,
sunlightlinux@...il.com,
superm1@...nel.org
Subject: Re: [PATCH 1/1] drm/amdgpu: Fix TLB flush failures after hibernation resume

Hi Alex,

Thank you for the detailed review and for pointing out the ordering issue.

You're absolutely right - I misunderstood the call sequence. Setting
resume_gpu_stable to false in amdgpu_device_resume() happens after
gfx_v9_0_cp_resume(), which defeats the purpose and permanently
disables the KIQ path.

However, I'm still experiencing the TLB flush failures after hibernation
resume on AMD Cezanne (Renoir):

  amdgpu: TLB flush failed for PASID xxxxx
  amdgpu: failed to write reg 28b4 wait reg 28c6
  amdgpu: failed to write reg 1a6f4 wait reg 1a706

If kiq sched.ready is being handled correctly as you described, what
else could cause these failures during resume? Are there any known
issues with KIQ-based TLB invalidation after hibernation on GFX9?

Should I investigate:
- Timing issues with KIQ command submission during early resume?
- Power/clock gating states affecting KIQ functionality?
- Missing synchronization after KIQ initialization?

Any guidance on the correct direction to investigate would be appreciated.

Thanks,
Ionut