lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <aca00cb25b813da4fd2f215829f02337f05642f3.camel@mailbox.org>
Date: Fri, 11 Apr 2025 14:44:18 +0200
From: Philipp Stanner <phasta@...lbox.org>
To: Christian König <christian.koenig@....com>, 
	phasta@...nel.org, Lyude Paul <lyude@...hat.com>, Danilo Krummrich
	 <dakr@...nel.org>, David Airlie <airlied@...il.com>, Simona Vetter
	 <simona@...ll.ch>, Sabrina Dubroca <sd@...asysnail.net>, Sumit Semwal
	 <sumit.semwal@...aro.org>
Cc: dri-devel@...ts.freedesktop.org, nouveau@...ts.freedesktop.org, 
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org, 
	linux-media@...r.kernel.org, linaro-mm-sig@...ts.linaro.org, 
	stable@...r.kernel.org
Subject: Re: [PATCH 1/3] drm/nouveau: Prevent signaled fences in pending list

On Fri, 2025-04-11 at 13:05 +0200, Christian König wrote:
>  Am 11.04.25 um 11:29 schrieb Philipp Stanner:
>  
> > [SNIP]
> >  
> > It could be, however, that at the same moment
> > nouveau_fence_signal() is
> > removing that entry, holding the appropriate lock.
> > 
> > So we have a race. Again.
> >  
>  
>  Ah, yes of course. If signaled is called with or without the lock is
> actually undetermined.
>  
>  
> >  
> > You see, fixing things in Nouveau is difficult :)
> > It gets more difficult if you want to clean it up "properly", so it
> > conforms to rules such as those from dma_fence.
> > 
> > I have now provided two fixes that both work, but you are not
> > satisfied
> > with from the dma_fence-maintainer's perspective. I understand
> > that,
> > but please also understand that it's actually not my primary task
> > to
> > work on Nouveau. I just have to fix this bug to move on with my
> > scheduler work.
> >  
>  
>  Well I'm happy with whatever solution as long as it works, but as
> far as I can see the approach with the callback simply doesn't.
>  
>  You just can't drop the fence reference for the list from the
> callback.
>  
>  
> >  
> > So if you have another idea, feel free to share it. But I'd like to
> > know how we can go on here.
> >  
>  
>  Well the fence code actually works, doesn't it? The problem is
> rather that setting the error throws a warning because it doesn't
> expect signaled fences on the pending list.
>  
>  Maybe we should fix that instead.

The fence code works as the author intended, but I would be happy if it
were more explicitly documented.

Regarding the WARN_ON: It occurs in dma_fence_set_error() because there
is an attempt to set an error code on a signaled fence. I don't think
that should be "fixed", it works as intended: You must not set an error
code of a fence that was already signaled.

The reason seems to be that once a fence is signaled, a third party
might evaluate the error code.

But I think this wasn't wat you meant with "fix".

In any case, there must not be signaled fences in nouveau's pending-
list. They must be removed immediately once they signal, and this must
not race.

>  
>  
> >  
> > I'm running out of ideas. What I'm wondering if we couldn't just
> > remove
> > performance hacky fastpath functions such as
> > nouveau_fence_is_signaled() completely. It seems redundant to me.
> >  
>  
>  That would work for me as well.

I'll test this approach. Seems a bit like the nuclear approach, but if
it works we'd at least clean up a lot of this mess.


P.


>  
>  
> >  
> > 
> > Or we might add locking to it, but IDK what was achieved with RCU
> > here.
> > In any case it's definitely bad that Nouveau has so many redundant
> > and
> > half-redundant mechanisms.
> >  
>  
>  Yeah, agree messing with the locks even more won't help us here.
>  
>  Regards,
>  Christian.
>  
>  
> >  
> > 
> > 
> > P.
> > 
> >  
> > >  
> > > 
> > > P.
> > > 
> > >  
> > > >  
> > > > Regards,
> > > > Christian.
> > > > 
> > > >  
> > > > >  
> > > > > P.
> > > > > 
> > > > > 
> > > > > 
> > > > >  
> > > > > >  
> > > > > > Regards,
> > > > > > Christian.
> > > > > > 
> > > > > >  
> > > > > > >  
> > > > > > > Replace the call to dma_fence_is_signaled() with
> > > > > > > nouveau_fence_base_is_signaled().
> > > > > > > 
> > > > > > > Cc: <stable@...r.kernel.org> # 4.10+, precise commit not
> > > > > > > to
> > > > > > > be
> > > > > > > determined
> > > > > > > Signed-off-by: Philipp Stanner <phasta@...nel.org>
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
> > > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > > > > b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > > > > index 7cc84472cece..33535987d8ed 100644
> > > > > > > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > > > > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > > > > @@ -274,7 +274,7 @@ nouveau_fence_done(struct
> > > > > > > nouveau_fence
> > > > > > > *fence)
> > > > > > >  			nvif_event_block(&fctx->event);
> > > > > > >  		spin_unlock_irqrestore(&fctx->lock,
> > > > > > > flags);
> > > > > > >  	}
> > > > > > > -	return dma_fence_is_signaled(&fence->base);
> > > > > > > +	return test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
> > > > > > > &fence-
> > > > > > >  
> > > > > > > >  
> > > > > > > > base.flags);
> > > > > > > >  
> > > > > > >  
> > > > > > >  }
> > > > > > >  
> > > > > > >  static long
> > > > > > >  
> > > > > >  
> > > > >  
> > > >  
> > >   
> >   
>  
>  


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ