Message-ID: <b01f77e2-a885-af0e-ef9b-265e93b2dee0@linux.intel.com>
Date:   Tue, 3 Oct 2023 20:57:29 +0200
From:   Thomas Hellström 
        <thomas.hellstrom@...ux.intel.com>
To:     Danilo Krummrich <dakr@...hat.com>,
        Boris Brezillon <boris.brezillon@...labora.com>
Cc:     airlied@...il.com, daniel@...ll.ch, matthew.brost@...el.com,
        sarah.walker@...tec.com, donald.robson@...tec.com,
        christian.koenig@....com, faith@...strand.net,
        dri-devel@...ts.freedesktop.org, nouveau@...ts.freedesktop.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH drm-misc-next v5 4/6] drm/gpuvm: track/lock/validate
 external/evicted objects


On 10/3/23 18:55, Danilo Krummrich wrote:
> It seems like we're mostly aligned on this series, except for the key
> controversy we've been discussing for a few versions now: locking of the
> internal lists. Hence, let's reiterate the options we have to get this
> out of the way.
>
> (1) The spinlock dance. This basically works for every use case, whether
>      updating the VA space from the IOCTL, from the fence signaling path
>      or anywhere else. However, it has the downside of requiring
>      spin_lock() / spin_unlock() for *each* list element when locking all
>      external objects and validating all evicted objects (see the sketch
>      below). Typically, the number of extobjs and evicted objects
>      shouldn't be excessive, but there might be exceptions, e.g. Xe.
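>
>      A rough sketch of the dance; the field and helper names below are
>      illustrative, not necessarily the actual API:
>
>          struct drm_gpuvm_bo *vm_bo, *next;
>          int ret;
>
>          spin_lock(&gpuvm->extobj.lock);
>          list_for_each_entry_safe(vm_bo, next, &gpuvm->extobj.list,
>                                   list.entry.extobj) {
>                  /* Pin the entry so it survives the unlocked section. */
>                  drm_gpuvm_bo_get(vm_bo);
>                  spin_unlock(&gpuvm->extobj.lock);
>
>                  /* dma_resv_lock() may sleep, hence the unlock above. */
>                  ret = drm_exec_lock_obj(exec, vm_bo->obj);
>                  drm_gpuvm_bo_put(vm_bo);
>                  if (ret)
>                          return ret;
>
>                  /* Real code must additionally cope with entries being
>                   * added or removed while the lock was dropped, which is
>                   * what makes this tricky. */
>                  spin_lock(&gpuvm->extobj.lock);
>          }
>          spin_unlock(&gpuvm->extobj.lock);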
>
> (2) The dma-resv lock dance. This is convenient for drivers updating the
>      VA space from a VM_BIND ioctl() and is especially efficient if such
>      drivers have a huge number of external and/or evicted objects to
>      manage. However, the downside is that it requires a few tricks in
>      drivers updating the VA space from the fence signaling path (e.g.
>      job_run()). Design-wise, I'm still skeptical that it is a good idea
>      to protect internal data structures with external locks, such that
>      it's not clear to callers that a certain function accesses one of
>      those resources and hence needs protection. E.g. it is
>      counter-intuitive that drm_gpuvm_bo_put() requires both the dma-resv
>      lock of the corresponding object and the VM's dma-resv lock to be
>      held (see the sketch below). (Additionally, there were some concerns
>      from amdgpu regarding flexibility in terms of using GPUVM for
>      non-VM_BIND uAPIs and compute; however, AFAICS those discussions did
>      not conclude, and to me it's still unclear why it wouldn't work.)
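>
>      Sketched, this means list mutators merely assert the VM's dma-resv
>      lock rather than taking an internal lock (again, names are
>      illustrative only):
>
>          static void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>          {
>                  struct drm_gpuvm *gpuvm = vm_bo->vm;
>
>                  /* The extobj list is protected by the VM's dma-resv
>                   * lock, so every caller that mutates it, including
>                   * drm_gpuvm_bo_put(), must already hold that lock. */
>                  dma_resv_assert_held(drm_gpuvm_resv(gpuvm));
>
>                  if (list_empty(&vm_bo->list.entry.extobj))
>                          list_add_tail(&vm_bo->list.entry.extobj,
>                                        &gpuvm->extobj.list);
>          }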
>
> (3) Simply use an internal mutex per list. This adds a tiny (IMHO
>      negligible) overhead for drivers updating the VA space from a
>      VM_BIND ioctl(), namely a *single* mutex_lock()/mutex_unlock() when
>      locking all external objects and validating all evicted objects
>      (see the sketch below). It still requires some tricks for drivers
>      updating the VA space from the fence signaling path. However, it's
>      as simple as it can be, and hence far less error prone, as well as
>      self-contained and thus easy to use. Additionally, it's flexible in
>      that we don't expect drivers to already hold certain locks that, in
>      some situations, they might not be able to acquire in the first
>      place.
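>
>      Roughly like this (names illustrative; the drm_exec contention
>      handling is elided for brevity):
>
>          int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>                                        struct drm_exec *exec)
>          {
>                  struct drm_gpuvm_bo *vm_bo;
>                  int ret = 0;
>
>                  /* A single lock/unlock pair for the whole walk, rather
>                   * than one pair per list element as in (1). */
>                  mutex_lock(&gpuvm->extobj.lock);
>                  list_for_each_entry(vm_bo, &gpuvm->extobj.list,
>                                      list.entry.extobj) {
>                          ret = drm_exec_lock_obj(exec, vm_bo->obj);
>                          if (ret)
>                                  break;
>                  }
>                  mutex_unlock(&gpuvm->extobj.lock);
>
>                  return ret;
>          }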
>
> (4) Arbitrary combinations of the above. For instance, the current V5
>      implements both (1) and (2) (as either one or the other). But (1)
>      combined with (3) (i.e. (1) in addition to (3)) would also be an
>      option, where a driver could opt in to the spinlock dance in case
>      it updates the VA space from the fence signaling path.
>
> I also considered a few other options; however, they don't seem to be
> flexible enough. For instance, we could use SRCU for the external object
> list. However, this falls apart once a driver wants to remove and re-add
> extobjs for the same VM_BO instance. (For the same reason it wouldn't
> work for evicted objects.)
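>
> To illustrate why: with SRCU, an entry removed from the list may still be
> visible to readers until a grace period has elapsed, so its list_head
> must not be reused before then (sketch only, names illustrative):
>
>     list_del_rcu(&vm_bo->list.entry.extobj);
>     /* Re-adding the same entry is only safe once all SRCU readers are
>      * done with it; waiting for a grace period per element is a
>      * non-starter. */
>     synchronize_srcu(&gpuvm->extobj.srcu);
>     list_add_tail_rcu(&vm_bo->list.entry.extobj, &gpuvm->extobj.list);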
>
> Personally, after seeing the weird implications of (1), (2) and a
> combination of both, I tend to go with (3), optionally with an opt-in
> for (1). The reason for the latter is that with (3), the weirdness of
> (1) on its own mostly disappears.
>
> Please let me know what you think, and, of course, ideas other than the
> ones mentioned above are still welcome.
>
> - Danilo
>
Here are the locking principles Daniel put together, which Dave once 
called on us to apply when reviewing DRM code. They were prompted by 
very fragile and hard-to-understand locking patterns in the i915 driver, 
and I think the xe vm_bind locking design was made with them in mind 
(I'm not sure exactly who wrote what, though, so I can't say for sure).

https://blog.ffwll.ch/2022/07/locking-engineering.html
https://blog.ffwll.ch/2022/08/locking-hierarchy.html

At least to me, this motivates using the resv design unless we strictly 
need lower-level locks that are taken in the eviction or userptr 
invalidation paths. It doesn't rule out spinlocks or lock-dropping 
tricks where those are really necessary, but it pretty much rules out 
RCU / SRCU, from what I can tell.

It also calls for documenting how individual members of structs are 
protected whenever possible.
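
For example (a sketch only; the member names are made up):

    struct drm_gpuvm {
            ...
            struct {
                    /**
                     * @extobj.list: list of external objects.
                     * Protected by @extobj.lock.
                     */
                    struct list_head list;
                    /** @extobj.lock: protects @extobj.list. */
                    spinlock_t lock;
            } extobj;
            ...
    };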

Thanks,
Thomas

