linux-kernel - Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <CAKMK7uHEwj6jiZkRZ5PaCUNWcuU9oE4KYm4XHZwHnFzEuChZ7w@mail.gmail.com>
Date:   Fri, 19 Jun 2020 10:51:59 +0200
From:   Daniel Vetter <daniel.vetter@...ll.ch>
To:     Chris Wilson <chris@...is-wilson.co.uk>
Cc:     Daniel Stone <daniel@...ishbar.org>,
        Dave Airlie <airlied@...il.com>,
        linux-rdma <linux-rdma@...r.kernel.org>,
        Intel Graphics Development <intel-gfx@...ts.freedesktop.org>,
        LKML <linux-kernel@...r.kernel.org>,
        DRI Development <dri-devel@...ts.freedesktop.org>,
        "moderated list:DMA BUFFER SHARING FRAMEWORK" 
        <linaro-mm-sig@...ts.linaro.org>,
        Thomas Hellstrom <thomas.hellstrom@...el.com>,
        amd-gfx mailing list <amd-gfx@...ts.freedesktop.org>,
        Daniel Vetter <daniel.vetter@...el.com>,
        Linux Media Mailing List <linux-media@...r.kernel.org>,
        Christian König <christian.koenig@....com>,
        Mika Kuoppala <mika.kuoppala@...el.com>
Subject: Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations

On Fri, Jun 19, 2020 at 10:25 AM Chris Wilson <chris@...is-wilson.co.uk> wrote:
>
> Quoting Daniel Stone (2020-06-11 10:01:46)
> > Hi,
> >
> > On Thu, 11 Jun 2020 at 09:44, Dave Airlie <airlied@...il.com> wrote:
> > > On Thu, 11 Jun 2020 at 18:01, Chris Wilson <chris@...is-wilson.co.uk> wrote:
> > > > Introducing a global lockmap that cannot capture the rules correctly,
> > >
> > > Can you document the rules all drivers should be following then,
> > > because from here it looks to get refactored every version of i915,
> > > and it would be nice if we could all aim for the same set of things
> > > roughly. We've already had enough problems with amdgpu vs i915 vs
> > > everyone else with fences, if this stops that in the future then I'd
> > > rather we have that than just some unwritten rules per driver and
> > > untestable.
> >
> > As someone who has sunk a bunch of work into explicit-fencing
> > awareness in my compositor so I can never be blocked, I'd be
> > disappointed if the infrastructure was ultimately pointless because
> > the documented fencing rules were \_o_/ or thereabouts. Lockdep
> > definitely isn't my area of expertise so I can't comment on the patch
> > per se, but having something to ensure we don't hit deadlocks sure
> > seems a lot better than nothing.
>
> This is doing dependency analysis on execution contexts which is a far
> cry from doing the fence dependency analysis, and so has to actively
> ignore the cycles that must exist on the dma side, and also the cycles
> that prevent entering execution contexts on the CPU. It has to actively
> ignore scheduler execution contexts, for lockdep cries, and so we do not
> get analysis of the locking contexts along that path. This would be
> solvable along the lines of extending lockdep ala lockdep_dma_enter().

drm/scheduler is annotated, found some rather improbably to hit issues
in practice. But from the quick chat I've had with König and others I
think he agrees that it's real at least in the theoretical sense.
Probably should consider playing lottery if you hit it in practice
though :-)

> Had i915's execution flow been marked up, it should have found the
> dubious wait for external fences inside the dead GPU recovery, and
> probably found a few more things to complain about with the reset locking.
> [Note we already do the same annotations for wait-vs-reset, but not
> reset-vs-execution.]

I know it splats, that's why the tdr annotation patch comes with a
spec proposal for lifting the wait busting we do in i915 to the
dma_fence level. I included that because amdgpu has the same problem
on modern hw. Apparently their planned fix (because they've hit this
bug in testing) was to push some shared lock down into their
atomic_comit_tail function and use that in gpu reset, so don't seem
that interested in extending dma_fence.

For i915 it's just gen2/3 display, and cross-driver dma-buf/fence
usage for those is nil and won't change. Pragmatic solution imo would
be to just not annotate gpu reset on these platforms, and relying on
our wait busting plus igt tests to make sure it keeps working as-is.
The point of the explicit annotations for the signalling side is very
much that it can be rolled out gradually, and entirely left out for
old legacy paths that aren't worth fixing.

> Determination of which waits are legal and which are not is entirely ad
> hoc, for there is no status change tracking in the dependency analysis
> [that is once an execution context is linked to a published fence, again
> integral to lockdep.] Consider if the completion chain in atomic is
> swapped out for the morally equivalent fences along intertwined timelines,
> and so it does a bunch of dma_fence_wait() instead. Why are those waits
> legal despite them being after we have committed to fulfilling the out
> fence? [Why are the waits on and for the GPU legal, since they equally
> block execution flow?]

No need to consider, it's already real and resulted in some pretty
splats until I got the recursion handling right.

> Forcing a generic primitive to always be part of the same global map is
> horrible. You forgo being able to use the primitive for unrelated tasks,
> lose the ability to name particular contexts to gain more informative
> dependency cycle reports from having the explicit linkage. You can add
> wait_map tracking without loss of generality [in less than 10 lines],
> and you can still enforce that all fences used for a common purpose
> follow the same rules [the simplest way being to default to the singular
> wait_map]. But it's the explicitly named execution contexts that are the
> biggest boon to reading the code and reading the lockdep warns.

So one thing that's maybe not clear here: This doesn't track the DAG
of dependencies. Doesn't even try, I'm still faithfully assuming
drivers get that part right. Which is a gap and maybe we should fix
this, but not the goal here.

All this does is validate fences against anything else that might be
going on in the system. E.g. your recursion example for atomic is
handled by just assuming that any dma_fence_wait within a signalling
section is legit and correct. We can add this later on, but not with
lockdep, since lockdep works with classes. And proofing that
dma_fences are acyclic requires you track them all as individuals.
Entirely different things.

That still leaves the below:

> Forcing a generic primitive to always be part of the same global map is
> horrible.

And  no concrete example or reason for why that's not possible.
Because frankly it's not horrible, this is what upstream is all about:
Shared concepts, shared contracts, shared code.

The proposed patches might very well encode the wrong contract, that's
all up for discussion. But fundamentally questioning that we need one
is missing what upstream is all about.

> This is a bunch of ad hoc tracking for a very narrow purpose applied
> globally, with loss of information.

It doesn't solve every problem indeed. I'm happy to review patches to
check acyclic-ness of dma-fence at the global level from you, I
haven't figured out yet how to make that happen. I know i915-gem has
that, but this is about the cross-driver contract here.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch