linux-kernel - [PATCH 1/2] drm/nouveau/disp/nv50-: Set lock_core in curs507a

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20251219215344.170852-2-lyude@redhat.com>
Date: Fri, 19 Dec 2025 16:52:02 -0500
From: Lyude Paul <lyude@...hat.com>
To: dri-devel@...ts.freedesktop.org,
	nouveau@...ts.freedesktop.org,
	linux-kernel@...r.kernel.org
Cc: stable@...r.kernel.org,
	"Timur Tabi" <ttabi@...dia.com>,
	"Dave Airlie" <airlied@...hat.com>,
	"Maarten Lankhorst" <maarten.lankhorst@...ux.intel.com>,
	"Ben Skeggs" <bskeggs@...dia.com>,
	"Simona Vetter" <simona@...ll.ch>,
	"Ben Skeggs" <bskeggs@...hat.com>,
	"David Airlie" <airlied@...il.com>,
	"Thomas Zimmermann" <tzimmermann@...e.de>,
	"Maxime Ripard" <mripard@...nel.org>,
	"Danilo Krummrich" <dakr@...nel.org>,
	"Lyude Paul" <lyude@...hat.com>
Subject: [PATCH 1/2] drm/nouveau/disp/nv50-: Set lock_core in curs507a_prepare

For a while, I've been seeing a strange issue where some (usually not all)
of the display DMA channels will suddenly hang, particularly when there is
a visible cursor on the screen that is being frequently updated, and
especially when said cursor happens to go between two screens. While this
brings back lovely memories of fixing Intel Skylake bugs, I would quite
like to fix it :).

It turns out the problem that's happening here is that we're managing to
reach nv50_head_flush_set() in our atomic commit path without actually
holding nv50_disp->mutex. This means that cursor updates happening in
parallel (along with any other atomic updates that need to use the core
channel) will race with eachother, which eventually causes us to corrupt
the pushbuffer - leading to a plethora of various GSP errors, usually:

  nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 00000218 00102680 00000004 00800003
  nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 0000021c 00040509 00000004 00000001
  nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 00000000 00000000 00000001 00000001

The reason this is happening is because generally we check whether we need
to set nv50_atom->lock_core at the end of nv50_head_atomic_check().
However, curs507a_prepare is called from the fb_prepare callback, which
happens after the atomic check phase. As a result, this can lead to commits
that both touch the core channel but also don't grab nv50_disp->mutex.

So, fix this by making sure that we set nv50_atom->lock_core in
cus507a_prepare().

Signed-off-by: Lyude Paul <lyude@...hat.com>
Fixes: 1590700d94ac ("drm/nouveau/kms/nv50-: split each resource type into their own source files")
Cc: <stable@...r.kernel.org> # v4.18+
---
 drivers/gpu/drm/nouveau/dispnv50/curs507a.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/curs507a.c b/drivers/gpu/drm/nouveau/dispnv50/curs507a.c
index a95ee5dcc2e39..1a889139cb053 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/curs507a.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/curs507a.c
@@ -84,6 +84,7 @@ curs507a_prepare(struct nv50_wndw *wndw, struct nv50_head_atom *asyh,
 		asyh->curs.handle = handle;
 		asyh->curs.offset = offset;
 		asyh->set.curs = asyh->curs.visible;
+		nv50_atom(asyh->state.state)->lock_core = true;
 	}
 }

-- 
2.52.0