[<prev] [next>] [day] [month] [year] [list]
Message-ID: <CAKMK7uHOcH+rWhor7zzqqcjCUtxz_-5stLAOVD=4_ED+QjN8oQ@mail.gmail.com>
Date: Thu, 4 Jun 2020 11:36:24 +0200
From: Daniel Vetter <daniel.vetter@...ll.ch>
To: Chris Wilson <chris@...is-wilson.co.uk>
Cc: Thomas Hellström <thomas_os@...pmail.org>,
linux-rdma <linux-rdma@...r.kernel.org>,
Intel Graphics Development <intel-gfx@...ts.freedesktop.org>,
LKML <linux-kernel@...r.kernel.org>,
amd-gfx list <amd-gfx@...ts.freedesktop.org>,
"moderated list:DMA BUFFER SHARING FRAMEWORK"
<linaro-mm-sig@...ts.linaro.org>,
Thomas Hellstrom <thomas.hellstrom@...el.com>,
DRI Development <dri-devel@...ts.freedesktop.org>,
Daniel Vetter <daniel.vetter@...el.com>,
Mika Kuoppala <mika.kuoppala@...el.com>,
Christian König <christian.koenig@....com>,
"open list:DMA BUFFER SHARING FRAMEWORK"
<linux-media@...r.kernel.org>
Subject: Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations
On Thu, Jun 4, 2020 at 11:27 AM Chris Wilson <chris@...is-wilson.co.uk> wrote:
>
> Quoting Daniel Vetter (2020-06-04 10:21:46)
> > On Thu, Jun 4, 2020 at 10:57 AM Thomas Hellström (Intel)
> > <thomas_os@...pmail.org> wrote:
> > >
> > >
> > > On 6/4/20 10:12 AM, Daniel Vetter wrote:
> > > ...
> > > > Thread A:
> > > >
> > > > mutex_lock(A);
> > > > mutex_unlock(A);
> > > >
> > > > dma_fence_signal();
> > > >
> > > > Thread B:
> > > >
> > > > mutex_lock(A);
> > > > dma_fence_wait();
> > > > mutex_unlock(A);
> > > >
> > > > Thread B is blocked on A signalling the fence, but A never gets around
> > > > to that because it cannot acquire the lock A.
> > > >
> > > > Note that dma_fence_wait() is allowed to be nested within
> > > > dma_fence_begin/end_signalling sections. To allow this to happen the
> > > > read lock needs to be upgraded to a write lock, which means that any
> > > > other lock is acquired between the dma_fence_begin_signalling() call and
> > > > the call to dma_fence_wait(), and still held, this will result in an
> > > > immediate lockdep complaint. The only other option would be to not
> > > > annotate such calls, defeating the point. Therefore these annotations
> > > > cannot be sprinkled over the code entirely mindless to avoid false
> > > > positives.
> > >
> > > Just realized, isn't that example actually a true positive, or at least
> > > a great candidate for a true positive, since if another thread reenters
> > > that signaling path, it will block on that mutex, and the fence would
> > > never be signaled unless there is another signaling path?
> >
> > Not sure I understand fully, but I think the answer is "it's complicated".
>
> See cd8084f91c02 ("locking/lockdep: Apply crossrelease to completions")
>
> dma_fence usage here is nothing but another name for a completion.
Quoting from my previous cover letter:
"I've dragged my feet for years on this, hoping that cross-release lockdep
would do this for us, but well that never really happened unfortunately.
So here we are."
I discussed this with Peter, cross-release not getting in is pretty
final it seems. The trouble is false positives without explicit
begin/end annotations reviewed by humans - ime from just these few
examples you just can't guess this stuff by computeres, you need real
brains thinking about all the edge cases, and where exactly the
critical section starts and ends. Without that you're just going to
drown in a sea of false positives and yuck.
So yeah I had hopes for cross-release too, unfortunately that was
entirely in vain and a distraction.
Now I guess it would be nice if there's a per-class
completion_begin/end annotation for the more generic problem. But then
also most people don't have a cross-driver completion api contract
like dma_fence is, with some of the most ridiculous over the top
constraints of what's possible and what's not possible on each side of
the cross-release. We do have a bit an outsized benefit (in pain
reduction) vs cost ratio here.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Powered by blists - more mailing lists