[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ee63ca7d-77d2-44d8-973b-7276f8c4d4a5@amd.com>
Date: Mon, 10 Nov 2025 12:24:46 +0100
From: Christian König <christian.koenig@....com>
To: Philipp Stanner <phasta@...nel.org>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>, Alex Deucher <alexander.deucher@....com>,
Andrey Grodzovsky <Andrey.Grodzovsky@....com>, dakr@...nel.org,
Matthew Brost <matthew.brost@...el.com>
Cc: dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH] drm/sched: Fix UB in spsc_queue
As far as I can see that is not correct or rather not complete.
The peek function should only be used to opportunistically look at the top of the queue. It would only be problematic if it returns a non NULL value once and then a NULL value later.
The whole idea of the SPSC is that it is barrier-free and the signaling of new entries to the consumer side is providing the barrier.
So basically on the provider side you have
spsc_push(entry)
wake_up(consumer)
And on the consumer side you have:
woken_up_by_provider() {
entry = spsc_peek();
...
spsc_pop();
}
The problem we are facing here is that the spsc only provides the guarantee that you see the entry pointer, but not the content of entry itself.
So use cases like:
woken_up_by_provider() {
while (entry = spsc_peek()) {
...
spsc_pop();
}
}
Are illegal since you don't have the correct memory barriers any more.
Took me an eternity to understand that as well, so bear with me that I didn't previously explained that.
Question is what should we do?
Regards,
Christian.
On 11/10/25 09:19, Philipp Stanner wrote:
> The spsc_queue is an unlocked, highly asynchronous piece of
> infrastructure. Its inline function spsc_queue_peek() obtains the head
> entry of the queue.
>
> This access is performed without READ_ONCE() and is, therefore,
> undefined behavior. In order to prevent the compiler from ever
> reordering that access, or even optimizing it away, a READ_ONCE() is
> strictly necessary. This is easily proven by the fact that
> spsc_queue_pop() uses this very pattern to access the head.
>
> Add READ_ONCE() to spsc_queue_peek().
>
> Cc: stable@...r.kernel.org # v4.16+
> Fixes: 27105db6c63a ("drm/amdgpu: Add SPSC queue to scheduler.")
> Signed-off-by: Philipp Stanner <phasta@...nel.org>
> ---
> I think this makes it less broken, but I'm not even sure if it's enough
> or more memory barriers or an rcu_dereference() would be correct. The
> spsc_queue is, of course, not documented and the existing barrier
> comments are either false or not telling.
>
> If someone has an idea, shoot us the info. Otherwise I think this is the
> right thing to do for now.
>
> P.
> ---
> include/drm/spsc_queue.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/drm/spsc_queue.h b/include/drm/spsc_queue.h
> index ee9df8cc67b7..39bada748ffc 100644
> --- a/include/drm/spsc_queue.h
> +++ b/include/drm/spsc_queue.h
> @@ -54,7 +54,7 @@ static inline void spsc_queue_init(struct spsc_queue *queue)
>
> static inline struct spsc_node *spsc_queue_peek(struct spsc_queue *queue)
> {
> - return queue->head;
> + return READ_ONCE(queue->head);
> }
>
> static inline int spsc_queue_count(struct spsc_queue *queue)
Powered by blists - more mailing lists