Message-ID: <aac87a7e-5a45-4b54-a43b-cb92c5df669c@amd.com>
Date: Thu, 22 May 2025 15:05:02 +0200
From: Christian König <christian.koenig@....com>
To: phasta@...nel.org, Lyude Paul <lyude@...hat.com>,
 Danilo Krummrich <dakr@...nel.org>, David Airlie <airlied@...il.com>,
 Simona Vetter <simona@...ll.ch>, Sumit Semwal <sumit.semwal@...aro.org>
Cc: dri-devel@...ts.freedesktop.org, nouveau@...ts.freedesktop.org,
 linux-kernel@...r.kernel.org, linux-media@...r.kernel.org
Subject: Re: [PATCH 2/2] drm/nouveau: Don't signal when killing the fence
 context

On 5/22/25 14:42, Philipp Stanner wrote:
> On Thu, 2025-05-22 at 14:34 +0200, Christian König wrote:
>> On 5/22/25 14:20, Philipp Stanner wrote:
>>> On Thu, 2025-05-22 at 14:06 +0200, Christian König wrote:
>>>> On 5/22/25 13:25, Philipp Stanner wrote:
>>>>> dma_fence_is_signaled_locked(), which is used in
>>>>> nouveau_fence_context_kill(), can signal fences below the
>>>>> surface
>>>>> through a callback.
>>>>>
>>>>> There is neither need for nor use in doing that when killing a
>>>>> fence
>>>>> context.
>>>>>
>>>>> Replace dma_fence_is_signaled_locked() with
>>>>> __dma_fence_is_signaled(), a
>>>>> function which only checks, never signals.
>>>>
>>>> That is not a good approach.
>>>>
>>>> Having __dma_fence_is_signaled() means that others would be
>>>> allowed to call it as well.
>>>>
>>>> But nouveau can do that here only because it knows that the fence
>>>> was
>>>> issued by nouveau.
>>>>
>>>> What nouveau can do is test the signaled flag directly, but
>>>> that's
>>>> what you are trying to avoid as well.
>>>
>>> There are many parties who check the bit already.
>>>
>>> And if Nouveau is allowed to do that, one can just as well provide
>>> a
>>> wrapper for it.
>>
>> No, exactly that's what is usually avoided in cases like this here.
>>
>> See all the functions inside include/linux/dma-fence.h can be used by
>> everybody. It's basically the public interface of the dma_fence
>> object.
>>
>> So testing if a fence is signaled without calling the callback is
>> only allowed by whoever implemented the fence.
> 
> Why?
> 
> See, who owns the callback? -> the driver which emitted the fence. If
> the driver doesn't guarantee that all fences will be signaled, the
> callback (always returning false) doesn't help you in any way.
> 
> I think the issue you're seeing is more that a party that only ever
> checks a fence's state through callbacks (and doesn't signal them
> through interrupts for example) would run danger of fences never
> getting signaled.

Partially correct, yes.

> But that's already the case if someone doesn't implement the callback.

But then this implementation must always signal in some other way.

> The fundamental basis is always the same: The driver must guarantee
> that all fences get signaled. Independently from other users checking
> the fence this or that way, independently from the callback being
> implemented.

Yeah, but it is invalid for a caller not to ask the implementation whether the fence is signaled. The rationale behind that is to avoid abuse of the interface.

E.g. when you don't know the implementation side, use the defined API and don't mess with the internals. If you do know the implementation side, then it's valid to check the internals.
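
To make the distinction concrete, here is a rough userspace sketch, not kernel code. Every toy_* name is made up for illustration, and the sequence-number scheme is just an assumption about how an implementation might decide completion. It only models the two ways of asking discussed above: the public check, which consults the implementation's callback and may signal the fence as a side effect, and the implementation-only check, which just reads the flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model of a fence; all toy_* names are hypothetical. */
struct toy_fence;

struct toy_fence_ops {
	/* Optional callback: the implementation reports whether work is done. */
	bool (*signaled)(struct toy_fence *f);
};

struct toy_fence {
	const struct toy_fence_ops *ops;
	bool signaled_bit;	/* stands in for the signaled flag */
	int seqno;		/* sequence number this fence waits for */
	int hw_seqno_done;	/* last sequence number the "hardware" finished */
};

/*
 * Public path: usable by anyone holding a fence. It asks the
 * implementation and, if the callback reports completion, marks the
 * fence signaled as a side effect.
 */
static bool toy_fence_is_signaled(struct toy_fence *f)
{
	if (f->signaled_bit)
		return true;

	if (f->ops && f->ops->signaled && f->ops->signaled(f)) {
		f->signaled_bit = true;
		return true;
	}

	return false;
}

/*
 * Implementation-only path: reads the flag and nothing else. Only the
 * code that issued the fence knows whether this is sufficient.
 */
static bool toy_impl_test_signaled_bit(const struct toy_fence *f)
{
	return f->signaled_bit;
}

/* Example callback: done once the hardware has passed our seqno. */
static bool toy_hw_signaled(struct toy_fence *f)
{
	return f->hw_seqno_done >= f->seqno;
}

static const struct toy_fence_ops toy_ops = {
	.signaled = toy_hw_signaled,
};
```

With a fence whose work has completed but whose flag is not yet set, toy_impl_test_signaled_bit() returns false while toy_fence_is_signaled() returns true and sets the flag. A foreign caller that only reads the flag could therefore see an unsignaled fence that the implementation would have reported as done.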

Regards,
Christian.

> 
>>
>> In other words nouveau can test nouveau fences, i915 can test i915
>> fences, amdgpu can test amdgpu fences etc... But if you have the
>> wrapper that makes it officially allowed that nouveau starts testing
>> i915 fences and that would be problematic.
> 
> I don't see the context here. That applies to the other functions as
> well.
> 
> 
> P.
> 
>>
>> Regards,
>> Christian.
>>
>>>
>>> That has the advantage of centralizing the responsibility and
>>> documenting it.
>>>
>>> P.
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> Signed-off-by: Philipp Stanner <phasta@...nel.org>
>>>>> ---
>>>>>  drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
>>>>> index d5654e26d5bc..993b3dcb5db0 100644
>>>>> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
>>>>> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
>>>>> @@ -88,7 +88,7 @@ nouveau_fence_context_kill(struct nouveau_fence_chan *fctx, int error)
>>>>>  
>>>>>  	spin_lock_irqsave(&fctx->lock, flags);
>>>>>  	list_for_each_entry_safe(fence, tmp, &fctx->pending, head) {
>>>>> -		if (error && !dma_fence_is_signaled_locked(&fence->base))
>>>>> +		if (error && !__dma_fence_is_signaled(&fence->base))
>>>>>  			dma_fence_set_error(&fence->base, error);
>>>>>  
>>>>>  		if (nouveau_fence_signal(fence))
>>>>
>>>
>>
> 

