Message-ID: <YnkaUk0mZNuPsZ5r@phenom.ffwll.local>
Date: Mon, 9 May 2022 15:42:42 +0200
From: Daniel Vetter <daniel@...ll.ch>
To: Dmitry Osipenko <dmitry.osipenko@...labora.com>
Cc: Daniel Stone <daniel@...ishbar.org>,
Thomas Zimmermann <tzimmermann@...e.de>,
David Airlie <airlied@...ux.ie>,
Gerd Hoffmann <kraxel@...hat.com>,
Gurchetan Singh <gurchetansingh@...omium.org>,
Chia-I Wu <olvaffe@...il.com>,
Daniel Almeida <daniel.almeida@...labora.com>,
Gert Wollny <gert.wollny@...labora.com>,
Gustavo Padovan <gustavo.padovan@...labora.com>,
Tomeu Vizoso <tomeu.vizoso@...labora.com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>,
Rob Herring <robh@...nel.org>,
Steven Price <steven.price@....com>,
Alyssa Rosenzweig <alyssa.rosenzweig@...labora.com>,
Rob Clark <robdclark@...il.com>,
Emil Velikov <emil.l.velikov@...il.com>,
Robin Murphy <robin.murphy@....com>,
Dmitry Osipenko <digetx@...il.com>,
linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH v4 10/15] drm/shmem-helper: Take reservation lock instead
of drm_gem_shmem locks
On Fri, May 06, 2022 at 01:49:12AM +0300, Dmitry Osipenko wrote:
> On 5/5/22 11:12, Daniel Vetter wrote:
> > On Wed, May 04, 2022 at 06:56:09PM +0300, Dmitry Osipenko wrote:
> >> On 5/4/22 11:21, Daniel Vetter wrote:
> >> ...
> >>>>> - Maybe also do what you suggest and keep a separate lock for this, but
> >>>>> the fundamental issue is that this doesn't really work - if you share
> >>>>> buffers both ways with two drivers using shmem helpers, then the
> >>>>> ordering of this vmap_count_mutex vs dma_resv_lock is inconsistent and
> >>>>> you can get some nice deadlocks. So not a great approach (and also the
> >>>>> reason why we really need to get everyone to move towards dma_resv_lock
> >>>>> as _the_ buffer object lock, since otherwise we'll never get a
> >>>>> consistent lock nesting hierarchy).
> >>>>
> >>>> The separate locks should work okay because it will be always the
> >>>> exporter that takes the dma_resv_lock. But I agree that it's less ideal
> >>>> than defining the new rules for dma-bufs since sometime you will take
> >>>> the resv lock and sometime not, potentially hiding bugs related to lockings.
> >>>
> >>> That's the issue, some importers need to take the dma_resv_lock for
> >>> dma_buf_vmap too (e.g. to first nail the buffer in place when it's a
> >>> dynamic memory manager). In practice it'll work as well as what we have
> >>> currently, which is similarly inconsistent, except with per-driver locks
> >>> instead of shared locks from shmem helpers or dma-buf, so less obvious
> >>> that things are inconsistent.
> >>>
> >>> So yeah if it's too messy maybe the approach is to have a separate lock
> >>> for vmap for now, land things, and then fix up dma_buf_vmap in a follow up
> >>> series.
> >>
> >> The amdgpu driver was the first to introduce the concept of movable
> >> memory for dma-bufs. Now we want to support it for DRM SHMEM too. For
> >> both amdgpu ttm and shmem drivers we will want to hold the reservation
> >> lock when we're touching moveable buffers. The current way of denoting
> >> that dma-buf is movable is to implement the pin/unpin callbacks of the
> >> dma-buf ops, should be doable for shmem.
> >
> > Hm that sounds like a bridge too far? I don't think we want to start
> > adding moveable dma-bufs for shmem, thus far at least no one asked for
> > that. Goal here is just to streamline the locking a bit and align across
> > all the different ways of doing buffers in drm.
> >
> > Or do you mean something else and I'm just completely lost?
>
> I'm talking about aligning DRM locks with the dma-buf locks. The problem
> is that the convention of dma-bufs isn't specified yet. In particular
> there is no convention for the mapping operations.
>
> If we want to switch vmapping of shmem to use reservation lock, then
> somebody will have to hold this lock for dma_buf_vmap() and the locking
> convention needs to be specified firmly.

Ah yes that makes sense.

> In case of dynamic buffers, we will also need to specify whether
> dma_buf_vmap() should imply the implicit pinning by exporter or the
> buffer must be pinned explicitly by importer before dma_buf_vmap() is
> invoked.
>
> Perhaps I indeed shouldn't care about this for this patchset. The
> complete locking model of dma-bufs must be specified first.

Hm I thought vmap is meant to pin itself, and not rely on any other
pinning done already. And from a quick look through the long call chain
for amd (which is currently the only driver supporting dynamic dma-buf)
that seems to be the case.

But yeah the locking isn't specified yet, and that makes it a bit of a
mess :-(

> >> A day ago I found that mapping of imported dma-bufs is broken at least
> >> for the Tegra DRM driver (and likely for others too) because driver
> >> doesn't assume that anyone will try to mmap imported buffer and just
> >> doesn't handle this case at all, so we're getting a hard lockup on
> >> touching mapped memory because we're mapping something else than the
> >> dma-buf.
> >
> > Huh that sounds bad, how does this happen? Pretty much all pieces of
> > dma-buf (cpu vmap, userspace mmap, heck even dma_buf_attach) are optional
> > or at least can fail for various reasons. So exporters not providing mmap
> > support is fine, but importers then dying is not.
>
> Those drivers that die don't have userspace that uses dma-bufs
> extensively. I noticed it only because I was looking at this code too much
> for the last days.
>
> Drivers that don't die either map imported BOs properly or don't allow
> mapping at all.

Ah yeah driver bugs as explanation makes sense :-/

> >> My plan is to move the dma-buf management code to the level of DRM core
> >> and make it aware of the reservation locks for the dynamic dma-bufs.
> >> This way we will get the proper locking for dma-bufs and fix mapping of
> >> imported dma-bufs for Tegra and other drivers.
> >
> > So maybe we're completely talking past each other, or coffee is not
> > working here on my end, but I've no idea what you mean.
> >
> > We do have some helpers for taking care of the dma_resv_lock dance, and
> > Christian König has an rfc patch set to maybe unify this further. But that
> > should be fairly orthogonal to reworking shmem (it might help a bit with
> > reworking shmem though).
>
> The reservation lock itself doesn't help much shmem, IMO. It should help
> only in the context of dynamic dma-bufs and today we don't have a need
> in the dynamic shmem dma-bufs.
>
> You were talking about making DRM locks consistent with dma-buf locks,
> so I thought that your main point of making use of reservation locks
> for shmem is to prepare to the new locking scheme.
>
> I wanted to try to specify the dma-buf locking convention for mapping
> operations because it's missing right now and it should affect how DRM
> should take the reservation locks, but this is not easy to do as I see now.
>
> Could you please point at the Christian's RFC patch? He posted too many
> patches, can't find it :) I'm curious to take a look.

https://lore.kernel.org/dri-devel/20220504074739.2231-1-christian.koenig@amd.com/

Wrt this patch series here I'm wondering whether we could do an interim
solution that side-steps the dma_buf_vmap mess.

- in shmem helpers pin any vmapped buffer (it's how dma-buf works too),
  and that pinning would be done under dma_resv_lock (like with other
  drivers using dma_resv_lock for bo protection)

- switch over everything else except vmap code to dma_resv_lock, but leave
  vmap locking as-is

- shrinker then only needs dma_resv_trylock, which lets it check for
  pinned buffers, and that's good enough to exclude vmap'ed buffers.
  And it avoids mixing the vmap locking into the new shrinker code and
  driver interfaces.

This still leaves the vmap locking mess as-is, but I think that's a mess
that's orthogonal to shrinker work.
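
To make the three points above a bit more concrete, here's a rough
userspace model of the idea, not kernel code. Everything in it is made up
for illustration: struct model_bo, the model_* function names, and the
pthread mutex standing in for dma_resv_lock. It only shows the shape:
vmap pins under the reservation lock, and the shrinker trylocks and skips
anything that is contended or pinned.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a shmem GEM object. */
struct model_bo {
	pthread_mutex_t resv;   /* models the object's dma_resv lock */
	unsigned int pin_count; /* protected by resv */
	bool has_pages;         /* backing pages present; protected by resv */
};

static void model_bo_init(struct model_bo *bo)
{
	pthread_mutex_init(&bo->resv, NULL);
	bo->pin_count = 0;
	bo->has_pages = true;
}

/* vmap side: take the pin under the reservation lock before mapping. */
static void model_bo_vmap_pin(struct model_bo *bo)
{
	pthread_mutex_lock(&bo->resv);
	bo->pin_count++;
	pthread_mutex_unlock(&bo->resv);
}

/* vunmap side: drop the pin again, also under the reservation lock. */
static void model_bo_vunmap_unpin(struct model_bo *bo)
{
	pthread_mutex_lock(&bo->resv);
	assert(bo->pin_count > 0);
	bo->pin_count--;
	pthread_mutex_unlock(&bo->resv);
}

/*
 * Shrinker side: never block on the lock. If the lock is contended or the
 * object is pinned (which covers vmap'ed objects), skip it.
 * Returns true if the backing pages were dropped.
 */
static bool model_shrinker_try_evict(struct model_bo *bo)
{
	bool evicted = false;

	if (pthread_mutex_trylock(&bo->resv) != 0)
		return false;

	if (bo->pin_count == 0 && bo->has_pages) {
		bo->has_pages = false;
		evicted = true;
	}

	pthread_mutex_unlock(&bo->resv);
	return evicted;
}
```

The point of the model is that the shrinker never needs to know about the
vmap lock at all: the pin count checked under the trylock is enough to
keep it away from mapped buffers.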

Thoughts?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch