[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <11b7a8a5-170f-4815-a8ac-5dba2d8e67a1@igalia.com>
Date: Fri, 24 Oct 2025 09:31:26 +0100
From: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
To: Philipp Stanner <phasta@...nel.org>,
 Sumit Semwal <sumit.semwal@...aro.org>, Gustavo Padovan
 <gustavo@...ovan.org>, Christian König
 <christian.koenig@....com>
Cc: linux-media@...r.kernel.org, dri-devel@...ts.freedesktop.org,
 linux-kernel@...r.kernel.org, Danilo Krummrich <dakr@...nel.org>
Subject: Re: [PATCH] dma-fence: Correct return of dma_fence_driver_name()
On 24/10/2025 08:50, Philipp Stanner wrote:
> To decouple the dma_fence_ops lifetime from dma_fences lifetime RCU
> support was added to said function, coupled with using the signaled bit
> to detect whether the fence_ops might be gone already.
> 
> When implementing that a wrong string was set as a default return
> parameter, indicating that every driver whose fence is already signalled
> must be detached, which is frankly wrong.
Depends on how you look at it. After being signaled fence has to be 
detached from the driver. Ie. nothing belonging to this driver must be 
accessed via the fence.
I started with names and Christian has recently continued with ops.
> Reported-by: Danilo Krummrich <dakr@...nel.org>
> Fixes: 506aa8b02a8d ("dma-fence: Add safe access helpers and document the rules")
> Signed-off-by: Philipp Stanner <phasta@...nel.org>
> ---
> When this was merged, it sadly slipped by me. I think this entire RCU
> mechanism was / is an overengineered idea.
> 
> If we look at who actually uses dma_fence_driver_name() and
> dma_fence_timeline_name() – functions from which the largest share of
> the fence_ops vs. fence lifetime issue stems from – we discover that
> there is a single user:
> 
> i915.
Not quite. The trigger event for fixing this was actually xe where use 
after free was achievable with a trivial set of userspace steps. See 
reproducer at:
https://lore.kernel.org/igt-dev/20250312131835.83983-1-tvrtko.ursulin@igalia.com/
Essentially any fence exporter whose fence can be exported either 
directly via sync file, or via syncobj to sync file export, and has 
state accessible via the fence ops which may be freed after the fence is 
signalled, or if the driver can be unbound from the device and unloaded, 
is vulnerable.
> Isn't that driver even deprecated?
Not exactly. It is not getting support for new hardware generations, 
while the new driver is not supporting old. There is a cut off point and 
an overlap of around one generation. Although I am not even sure this 
overlap is officially supported by Intel.
> I think the better thing to do is: remove these functions alltogether,
> or at least deprecate them. Then the only lifetime issue left so solve
> is the callback functions.
That would be nice, I also do not see much value in exporting names to 
userspace. But first more conversation around breaking the sync file ABI 
needs to happen. I think we had a little bit of it when changing the 
names of signalled fences and thinking was existing tools which look at 
the names will mostly survive it. Not sure if they would if unsignalled 
names would change.
> 
> P.
> ---
>   drivers/dma-buf/dma-fence.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 3f78c56b58dc..1875a0abebd3 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -1111,7 +1111,7 @@ const char __rcu *dma_fence_driver_name(struct dma_fence *fence)
>   	if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>   		return fence->ops->get_driver_name(fence);
>   	else
> -		return "detached-driver";
> +		return "driver-whose-fence-is-already-signalled";
IMHO unnecessarily verbose and whether or not changing it to anything 
different warrants a Fixes: tag is debatable.
Regards,
Tvrtko
>   }
>   EXPORT_SYMBOL(dma_fence_driver_name);
>   
Powered by blists - more mailing lists
 
