linux-kernel - Re: [Linaro-mm-sig] [PATCH 2/4] dma-fence: dma-buf synchronization (v8 )

Open Source and information security mailing list archives

Message-ID: <502681AE.9030507@canonical.com>
Date:	Sat, 11 Aug 2012 18:00:46 +0200
From:	Maarten Lankhorst <maarten.lankhorst@...onical.com>
To:	Rob Clark <rob.clark@...aro.org>
CC:	sumit.semwal@...aro.org, linaro-mm-sig@...ts.linaro.org,
	linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
	linux-media@...r.kernel.org
Subject: Re: [Linaro-mm-sig] [PATCH 2/4] dma-fence: dma-buf synchronization
 (v8 )

Hey,

Op 11-08-12 17:14, Rob Clark schreef:
> On Fri, Aug 10, 2012 at 3:29 PM, Daniel Vetter <daniel@...ll.ch> wrote:
>> On Fri, Aug 10, 2012 at 04:57:52PM +0200, Maarten Lankhorst wrote:
>>> A dma-fence can be attached to a buffer which is being filled or consumed
>>> by hw, to allow userspace to pass the buffer without waiting to another
>>> device.  For example, userspace can call page_flip ioctl to display the
>>> next frame of graphics after kicking the GPU but while the GPU is still
>>> rendering.  The display device sharing the buffer with the GPU would
>>> attach a callback to get notified when the GPU's rendering-complete IRQ
>>> fires, to update the scan-out address of the display, without having to
>>> wake up userspace.
>>>
>>> A dma-fence is transient, one-shot deal.  It is allocated and attached
>>> to one or more dma-buf's.  When the one that attached it is done, with
>>> the pending operation, it can signal the fence.
>>>
>>>   + dma_fence_signal()
>>>
>>> The dma-buf-mgr handles tracking, and waiting on, the fences associated
>>> with a dma-buf.
>>>
>>> TODO maybe need some helper fxn for simple devices, like a display-
>>> only drm/kms device which simply wants to wait for exclusive fence to
>>> be signaled, and then attach a non-exclusive fence while scanout is in
>>> progress.
>>>
>>> The one pending on the fence can add an async callback:
>>>   + dma_fence_add_callback()
>>> The callback can optionally be cancelled with remove_wait_queue()
>>>
>>> Or wait synchronously (optionally with timeout or interruptible):
>>>   + dma_fence_wait()
>>>
>>> A default software-only implementation is provided, which can be used
>>> by drivers attaching a fence to a buffer when they have no other means
>>> for hw sync.  But a memory backed fence is also envisioned, because it
>>> is common that GPU's can write to, or poll on some memory location for
>>> synchronization.  For example:
>>>
>>>   fence = dma_buf_get_fence(dmabuf);
>>>   if (fence->ops == &bikeshed_fence_ops) {
>>>     dma_buf *fence_buf;
>>>     dma_bikeshed_fence_get_buf(fence, &fence_buf, &offset);
>>>     ... tell the hw the memory location to wait on ...
>>>   } else {
>>>     /* fall-back to sw sync * /
>>>     dma_fence_add_callback(fence, my_cb);
>>>   }
>>>
>>> On SoC platforms, if some other hw mechanism is provided for synchronizing
>>> between IP blocks, it could be supported as an alternate implementation
>>> with it's own fence ops in a similar way.
>>>
>>> To facilitate other non-sw implementations, the enable_signaling callback
>>> can be used to keep track if a device not supporting hw sync is waiting
>>> on the fence, and in this case should arrange to call dma_fence_signal()
>>> at some point after the condition has changed, to notify other devices
>>> waiting on the fence.  If there are no sw waiters, this can be skipped to
>>> avoid waking the CPU unnecessarily. The handler of the enable_signaling
>>> op should take a refcount until the fence is signaled, then release its ref.
>>>
>>> The intention is to provide a userspace interface (presumably via eventfd)
>>> later, to be used in conjunction with dma-buf's mmap support for sw access
>>> to buffers (or for userspace apps that would prefer to do their own
>>> synchronization).
>> I think the commit message should be cleaned up: Kill the TODO, rip out
>> the bikeshed_fence and otherwise update it to the latest code.
Agreed.

>>> v1: Original
>>> v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided
>>>     that dma-fence didn't need to care about the sw->hw signaling path
>>>     (it can be handled same as sw->sw case), and therefore the fence->ops
>>>     can be simplified and more handled in the core.  So remove the signal,
>>>     add_callback, cancel_callback, and wait ops, and replace with a simple
>>>     enable_signaling() op which can be used to inform a fence supporting
>>>     hw->hw signaling that one or more devices which do not support hw
>>>     signaling are waiting (and therefore it should enable an irq or do
>>>     whatever is necessary in order that the CPU is notified when the
>>>     fence is passed).
>>> v3: Fix locking fail in attach_fence() and get_fence()
>>> v4: Remove tie-in w/ dma-buf..  after discussion w/ danvet and mlankorst
>>>     we decided that we need to be able to attach one fence to N dma-buf's,
>>>     so using the list_head in dma-fence struct would be problematic.
>>> v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager.
>>> v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments
>>>     about checking if fence fired or not. This is broken by design.
>>>     waitqueue_active during destruction is now fatal, since the signaller
>>>     should be holding a reference in enable_signalling until it signalled
>>>     the fence. Pass the original dma_fence_cb along, and call __remove_wait
>>>     in the dma_fence_callback handler, so that no cleanup needs to be
>>>     performed.
>>> v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if
>>>     fence wasn't signaled yet, for example for hardware fences that may
>>>     choose to signal blindly.
>>> v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to
>>>     header and fixed include mess. dma-fence.h now includes dma-buf.h
>>>     All members are now initialized, so kmalloc can be used for
>>>     allocating a dma-fence. More documentation added.
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@...onical.com>
>> I like the design of this, and especially that it's rather simple ;-)
>>
>> A few comments to polish the interface, implementation and documentation a
>> bit below.
>>
>>> ---
>>>  Documentation/DocBook/device-drivers.tmpl |    2
>>>  drivers/base/Makefile                     |    2
>>>  drivers/base/dma-fence.c                  |  268 +++++++++++++++++++++++++++++
>>>  include/linux/dma-fence.h                 |  124 +++++++++++++
>>>  4 files changed, 395 insertions(+), 1 deletion(-)
>>>  create mode 100644 drivers/base/dma-fence.c
>>>  create mode 100644 include/linux/dma-fence.h
>>>
>>> diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
>>> index 7514dbf..36252ac 100644
>>> --- a/Documentation/DocBook/device-drivers.tmpl
>>> +++ b/Documentation/DocBook/device-drivers.tmpl
>>> @@ -126,6 +126,8 @@ X!Edrivers/base/interface.c
>>>       </sect1>
>>>       <sect1><title>Device Drivers DMA Management</title>
>>>  !Edrivers/base/dma-buf.c
>>> +!Edrivers/base/dma-fence.c
>>> +!Iinclude/linux/dma-fence.h
>>>  !Edrivers/base/dma-coherent.c
>>>  !Edrivers/base/dma-mapping.c
>>>       </sect1>
>>> diff --git a/drivers/base/Makefile b/drivers/base/Makefile
>>> index 5aa2d70..6e9f217 100644
>>> --- a/drivers/base/Makefile
>>> +++ b/drivers/base/Makefile
>>> @@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o
>>>  obj-y                        += power/
>>>  obj-$(CONFIG_HAS_DMA)        += dma-mapping.o
>>>  obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o
>>> -obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o
>>> +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o
>>>  obj-$(CONFIG_ISA)    += isa.o
>>>  obj-$(CONFIG_FW_LOADER)      += firmware_class.o
>>>  obj-$(CONFIG_NUMA)   += node.o
>>> diff --git a/drivers/base/dma-fence.c b/drivers/base/dma-fence.c
>>> new file mode 100644
>>> index 0000000..93448e4
>>> --- /dev/null
>>> +++ b/drivers/base/dma-fence.c
>>> @@ -0,0 +1,268 @@
>>> +/*
>>> + * Fence mechanism for dma-buf to allow for asynchronous dma access
>>> + *
>>> + * Copyright (C) 2012 Texas Instruments
>>> + * Author: Rob Clark <rob.clark@...aro.org>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify it
>>> + * under the terms of the GNU General Public License version 2 as published by
>>> + * the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful, but WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>>> + * more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License along with
>>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>>> + */
>>> +
>>> +#include <linux/slab.h>
>>> +#include <linux/sched.h>
>>> +#include <linux/export.h>
>>> +#include <linux/dma-fence.h>
>>> +
>>> +/**
>>> + * dma_fence_signal - signal completion of a fence
>>> + * @fence: the fence to signal
>>> + *
>>> + * All registered callbacks will be called directly (synchronously) and
>>> + * all blocked waters will be awoken. This should be always be called on
>>> + * software only fences, or alternatively be called after
>>> + * dma_fence_ops::enable_signaling is called.
>> I think we need to be cleare here when dma_fence_signal can be called:
>> - for a sw-only fence (i.e. created with dma_fence_create)
>>   dma_fence_signal _must_ be called under all circumstances.
>> - for any other fences, dma_fence_signal may be called, but it _must_ be
>>   called once the ->enable_signalling func has been called and return 0
>>   (i.e. success).
>> - it may be called only _once_.

As we discussed on irc it might be beneficial to be able to have it called
twice, the second time would be a noop, however.

>>> + */
>>> +int dma_fence_signal(struct dma_fence *fence)
>>> +{
>>> +     unsigned long flags;
>>> +     int ret = -EINVAL;
>>> +
>>> +     if (WARN_ON(!fence))
>>> +             return -EINVAL;
>>> +
>>> +     spin_lock_irqsave(&fence->event_queue.lock, flags);
>>> +     if (!fence->signaled) {
>>> +             fence->signaled = true;
>>> +             __wake_up_locked_key(&fence->event_queue, TASK_NORMAL,
>>> +                                  &fence->event_queue);
>>> +             ret = 0;
>>> +     } else
>>> +             WARN(1, "Already signaled");
>>> +     spin_unlock_irqrestore(&fence->event_queue.lock, flags);
>>> +
>>> +     return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(dma_fence_signal);
>>> +
>>> +static void release_fence(struct kref *kref)
>>> +{
>>> +     struct dma_fence *fence =
>>> +                     container_of(kref, struct dma_fence, refcount);
>>> +
>>> +     BUG_ON(waitqueue_active(&fence->event_queue));
>>> +
>>> +     if (fence->ops->release)
>>> +             fence->ops->release(fence);
>>> +
>>> +     kfree(fence);
>>> +}
>>> +
>>> +/**
>>> + * dma_fence_put - decreases refcount of the fence
>>> + * @fence:   [in]    fence to reduce refcount of
>>> + */
>>> +void dma_fence_put(struct dma_fence *fence)
>>> +{
>>> +     if (WARN_ON(!fence))
>>> +             return;
>>> +     kref_put(&fence->refcount, release_fence);
>>> +}
>>> +EXPORT_SYMBOL_GPL(dma_fence_put);
>>> +
>>> +/**
>>> + * dma_fence_get - increases refcount of the fence
>>> + * @fence:   [in]    fence to increase refcount of
>>> + */
>>> +void dma_fence_get(struct dma_fence *fence)
>>> +{
>>> +     if (WARN_ON(!fence))
>>> +             return;
>>> +     kref_get(&fence->refcount);
>>> +}
>>> +EXPORT_SYMBOL_GPL(dma_fence_get);
>>> +
>>> +static int check_signaling(struct dma_fence *fence)
>>> +{
>>> +     bool enable_signaling = false, signaled;
>>> +     unsigned long flags;
>>> +
>>> +     spin_lock_irqsave(&fence->event_queue.lock, flags);
>>> +     signaled = fence->signaled;
>>> +     if (!signaled && !fence->needs_sw_signal)
>>> +             enable_signaling = fence->needs_sw_signal = true;
>>> +     spin_unlock_irqrestore(&fence->event_queue.lock, flags);
>>> +
>>> +     if (enable_signaling) {
>>> +             int ret;
>>> +
>>> +             /* At this point, if enable_signaling returns any error
>>> +              * a wakeup has to be performanced regardless.
>>> +              * -ENOENT signals fence was already signaled. Any other error
>>> +              * inidicates a catastrophic hardware error.
>>> +              *
>>> +              * If any hardware error occurs, nothing can be done against
>>> +              * it, so it's treated like the fence was already signaled.
>>> +              * No synchronization can be performed, so we have to assume
>>> +              * the fence was already signaled.
>>> +              */
>>> +             ret = fence->ops->enable_signaling(fence);
>>> +             if (ret) {
>>> +                     signaled = true;
>>> +                     dma_fence_signal(fence);
>> I think we should call dma_fence_signal only for -ENOENT and pass all
>> other errors back as-is. E.g. on -ENOMEM or so we might want to retry ...

Also discussed on irc, boolean might be a better solution until we start dealing with
hardware on fire. :) This would however likely be dealt in the same way as signaling,
however.

>>> +             }
>>> +     }
>>> +
>>> +     if (!signaled)
>>> +             return 0;
>>> +     else
>>> +             return -ENOENT;
>>> +}
>>> +
>>> +static int
>>> +__dma_fence_wake_func(wait_queue_t *wait, unsigned mode, int flags, void *key)
>>> +{
>>> +     struct dma_fence_cb *cb =
>>> +                     container_of(wait, struct dma_fence_cb, base);
>>> +
>>> +     __remove_wait_queue(key, wait);
>>> +     return cb->func(cb, wait->private);
>>> +}
>>> +
>>> +/**
>>> + * dma_fence_add_callback - add a callback to be called when the fence
>>> + * is signaled
>>> + *
>>> + * @fence:   [in]    the fence to wait on
>>> + * @cb:              [in]    the callback to register
>>> + * @func:    [in]    the function to call
>>> + * @priv:    [in]    the argument to pass to function
>>> + *
>>> + * cb will be initialized by dma_fence_add_callback, no initialization
>>> + * by the caller is required. Any number of callbacks can be registered
>>> + * to a fence, but a callback can only be registered to one fence at a time.
>>> + *
>>> + * Note that the callback can be called from an atomic context.  If
>>> + * fence is already signaled, this function will return -ENOENT (and
>>> + * *not* call the callback)
>>> + */
>>> +int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb,
>>> +                        dma_fence_func_t func, void *priv)
>>> +{
>>> +     unsigned long flags;
>>> +     int ret;
>>> +
>>> +     if (WARN_ON(!fence || !func))
>>> +             return -EINVAL;
>>> +
>>> +     ret = check_signaling(fence);
>>> +
>>> +     spin_lock_irqsave(&fence->event_queue.lock, flags);
>>> +     if (!ret && fence->signaled)
>>> +             ret = -ENOENT;
>> The locking here is a bit suboptimal: We grab the fence spinlock once in
>> check_signalling and then again here. We should combine this into one
>> critical section.
> Fwiw, Maarten had the same thought.  I had suggested keep it
> clean/simple for now and get it working, and then go back and optimize
> after, so you can blame this one on me :-P
>
> I guess we could either just inline the check_signaling() code, but I
> didn't want to do that yet.  Or we could call check_signaling() with
> the lock already hand, and just drop and re-acquire it around the
> relatively infrequent enable_signaling() callback.

There's nothing that would prevent us from doing it in 1 go and do
enable_signaling after adding the callback. As danvet pointed out on irc,
dma_fence_wait has to be reworked to remove a race condition anyway.

>
>>> +
>>> +     if (!ret) {
>>> +             cb->base.flags = 0;
>>> +             cb->base.func = __dma_fence_wake_func;
>>> +             cb->base.private = priv;
>>> +             cb->fence = fence;
>>> +             cb->func = func;
>>> +             __add_wait_queue(&fence->event_queue, &cb->base);
>>> +     }
>>> +     spin_unlock_irqrestore(&fence->event_queue.lock, flags);
>>> +
>>> +     return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(dma_fence_add_callback);
>> I think for api completenes we should also have a
>> dma_fence_remove_callback function.
> We did originally but Maarten found it was difficult to deal with
> properly when the gpu's hang.  I think his alternative was just to
> require the hung driver to signal the fence.  I had kicked around the
> idea of a dma_fence_cancel() alternative to signal that could pass an
> error thru to the waiting driver.. although not sure if the other
> driver could really do anything differently at that point.

No, there is a very real reason I removed dma_fence_remove_callback. It is
absolutely non-trivial to cancel it once added, since you have to deal with
all kinds of race conditions.. See i915_gem_reset_requests in my git tree:
http://cgit.freedesktop.org/~mlankhorst/linux/commit/?id=673c4b2550bc63ec134bca47a95dabd617a689ce

This is the only way to do it completely deadlock/memory corruption free
since you essentially have a locking inversion to avoid. I had it wrong
the first 2 times too, even when I knew about a lot of the locking
complications. If you want to do it, in most cases it will likely be easier
to just eat the signal and ignore it instead of canceling.

>>> +
>>> +/**
>>> + * dma_fence_wait - wait for a fence to be signaled
>>> + *
>>> + * @fence:   [in]    The fence to wait on
>>> + * @intr:    [in]    if true, do an interruptible wait
>>> + * @timeout: [in]    absolute time for timeout, in jiffies.
>> I don't quite like this, I think we should keep the styl of all other
>> wait_*_timeout functions and pass the arg as timeout in jiffies (and also
>> the same return semantics). Otherwise well have funny code that needs to
>> handle return values differently depending upon whether it waits upon a
>> dma_fence or a native object (where it would us the wait_*_timeout
>> functions directly).
> We did start out this way, but there was an ugly jiffies roll-over
> problem that was difficult to deal with properly.  Using an absolute
> time avoided the problem.
Yeah, this makes it easier to wait on multiple fences, instead of
resetting the timeout over and over and over again, or manually
recalculating.

>> Also, I think we should add the non-_timeout variants, too, just for
>> completeness.
Would it be ok if timeout == 0 is special, then?

>>> + *
>>> + * Returns 0 on success, -EBUSY if a timeout occured,
>>> + * -ERESTARTSYS if the wait was interrupted by a signal.
>>> + */
>>> +int dma_fence_wait(struct dma_fence *fence, bool intr, unsigned long timeout)
>>> +{
>>> +     unsigned long cur;
>>> +     int ret;
>>> +
>>> +     if (WARN_ON(!fence))
>>> +             return -EINVAL;
>>> +
>>> +     cur = jiffies;
>>> +     if (time_after_eq(cur, timeout))
>>> +             return -EBUSY;
>>> +
>>> +     timeout -= cur;
>>> +
>>> +     ret = check_signaling(fence);
>>> +     if (ret == -ENOENT)
>>> +             return 0;
>>> +     else if (ret)
>>> +             return ret;
>>> +
>>> +     if (intr)
>>> +             ret = wait_event_interruptible_timeout(fence->event_queue,
>>> +                                                    fence->signaled,
>>> +                                                    timeout);
>> We have a race here, since fence->signaled is proctected by
>> fenc->event_queu.lock. There's a special variant of the wait_event macros
>> that automatically drops a spinlock at the right time, which would fit
>> here. Again, like for the callback function I think you then need to
>> open-code check_signalling to avoid taking the spinlock twice.
> yeah, this would work for the
> call-check_signaling()-with-lock-already-held approach to get rid of
> the double lock..
Ok.

>>> +     else
>>> +             ret = wait_event_timeout(fence->event_queue,
>>> +                             fence->signaled, timeout);
>>> +
>>> +     if (ret > 0)
>>> +             return 0;
>>> +     else if (!ret)
>>> +             return -EBUSY;
>>> +     else
>>> +             return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(dma_fence_wait);
>>> +
>>> +static int sw_enable_signaling(struct dma_fence *fence)
>>> +{
>>> +     /* dma_fence_create sets needs_sw_signal,
>>> +      * so this should never be called
>>> +      */
>>> +     WARN_ON_ONCE(1);
>>> +     return 0;
>>> +}
>>> +
>>> +static const struct dma_fence_ops sw_fence_ops = {
>>> +     .enable_signaling = sw_enable_signaling,
>>> +};
>>> +
>>> +/**
>>> + * dma_fence_create - create a simple sw-only fence
>>> + * @priv:    [in]    the value to use for the priv member
>>> + *
>>> + * This fence only supports signaling from/to CPU.  Other implementations
>>> + * of dma-fence can be used to support hardware to hardware signaling, if
>>> + * supported by the hardware, and use the dma_fence_helper_* functions for
>>> + * compatibility with other devices that only support sw signaling.
>>> + */
>>> +struct dma_fence *dma_fence_create(void *priv)
>>> +{
>>> +     struct dma_fence *fence;
>>> +
>>> +     fence = kmalloc(sizeof(struct dma_fence), GFP_KERNEL);
>>> +     if (!fence)
>>> +             return NULL;
>>> +
>>> +     __dma_fence_init(fence, &sw_fence_ops, priv);
>>> +     fence->needs_sw_signal = true;
>>> +
>>> +     return fence;
>>> +}
>>> +EXPORT_SYMBOL_GPL(dma_fence_create);
>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>> new file mode 100644
>>> index 0000000..e0ceddd
>>> --- /dev/null
>>> +++ b/include/linux/dma-fence.h
>>> @@ -0,0 +1,124 @@
>>> +/*
>>> + * Fence mechanism for dma-buf to allow for asynchronous dma access
>>> + *
>>> + * Copyright (C) 2012 Texas Instruments
>>> + * Author: Rob Clark <rob.clark@...aro.org>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify it
>>> + * under the terms of the GNU General Public License version 2 as published by
>>> + * the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful, but WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>>> + * more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License along with
>>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>>> + */
>>> +
>>> +#ifndef __DMA_FENCE_H__
>>> +#define __DMA_FENCE_H__
>>> +
>>> +#include <linux/err.h>
>>> +#include <linux/list.h>
>>> +#include <linux/wait.h>
>>> +#include <linux/list.h>
>>> +#include <linux/dma-buf.h>
>>> +
>>> +struct dma_fence;
>>> +struct dma_fence_ops;
>>> +struct dma_fence_cb;
>>> +
>>> +/**
>>> + * struct dma_fence - software synchronization primitive
>>> + * @refcount: refcount for this fence
>>> + * @ops: dma_fence_ops associated with this fence
>>> + * @priv: fence specific private data
>>> + * @event_queue: event queue used for signaling fence
>>> + * @signaled: whether this fence has been completed yet
>>> + * @needs_sw_signal: whether dma_fence_ops::enable_signaling
>>> + *                   has been called yet
>>> + *
>>> + * Read Documentation/dma-buf-synchronization.txt for usage.
>>> + */
>>> +struct dma_fence {
>>> +     struct kref refcount;
>>> +     const struct dma_fence_ops *ops;
>>> +     wait_queue_head_t event_queue;
>>> +     void *priv;
>>> +     bool signaled:1;
>>> +     bool needs_sw_signal:1;
>> I guess a comment here is in order that signaled and needs_sw_signal is
>> protected by event_queue.lock. Also, since the compiler is rather free to
>> do crazy stuff with bitfields, I think it's preferred style to use an
>> unsigned long and explicit bit #defines (ton ensure the compiler doesn't
>> generate loads/stores that leak to other members of the struct).
> yeah, good point.. I guess we should just change that to be a
> 'unsigned long' bitmask.
>
> BR,
> -R
+1
>>> +};
>>> +
>>> +typedef int (*dma_fence_func_t)(struct dma_fence_cb *cb, void *priv);
>>> +
>>> +/**
>>> + * struct dma_fence_cb - callback for dma_fence_add_callback
>>> + * @base: wait_queue_t added to event_queue
>>> + * @func: dma_fence_func_t to call
>>> + * @fence: fence this dma_fence_cb was used on
>>> + *
>>> + * This struct will be initialized by dma_fence_add_callback, additional
>>> + * data can be passed along by embedding dma_fence_cb in another struct.
>>> + */
>>> +struct dma_fence_cb {
>>> +     wait_queue_t base;
>>> +     dma_fence_func_t func;
>>> +     struct dma_fence *fence;
>>> +};
>>> +
>>> +/**
>>> + * struct dma_fence_ops - operations implemented for dma-fence
>>> + * @enable_signaling: enable software signaling of fence
>>> + * @release: [optional] called on destruction of fence
>>> + *
>>> + * Notes on enable_signaling:
>>> + * For fence implementations that have the capability for hw->hw
>>> + * signaling, they can implement this op to enable the necessary
>>> + * irqs, or insert commands into cmdstream, etc.  This is called
>>> + * in the first wait() or add_callback() path to let the fence
>>> + * implementation know that there is another driver waiting on
>>> + * the signal (ie. hw->sw case).
>>> + *
>>> + * A return value of -ENOENT will indicate that the fence has
>>> + * already passed. Any other errors will be treated as -ENOENT,
>>> + * and can happen because of hardware failure.
>>> + */
>>> +
>> I think we need to specify the calling contexts of these two.
>>
>>> +struct dma_fence_ops {
>>> +     int (*enable_signaling)(struct dma_fence *fence);
>> I think we should mandate that enable_signalling can be called from atomic
>> context, but not irq context (since I don't see a use-case for calling
>> this from irq context).

What would not having this called from irq context get you? I do agree
that you would be crazy to do so, but not sure why we should restrict it.

>>> +     void (*release)(struct dma_fence *fence);
>> Since a waiter might call ->release as a reaction to a signal, I think the
>> release callback must be able to handle any calling context, and
>> especially anything that calls dma_fence_signal.
Agreed. It is also the most likely case it will be called from irq context.

>>> +};
>>> +
>>> +struct dma_fence *dma_fence_create(void *priv);
>>> +
>>> +/**
>>> + * __dma_fence_init - Initialize a custom dma_fence.
>>> + * @fence:   [in]    The fence to initialize
>>> + * @ops:     [in]    The dma_fence_ops for operations on this fence.
>>> + * @priv:    [in]    The value to use for the priv member.
>>> + */
>>> +static inline void
>>> +__dma_fence_init(struct dma_fence *fence,
>>> +              const struct dma_fence_ops *ops, void *priv)
>>> +{
>>> +     WARN_ON(!ops || !ops->enable_signaling);
>>> +
>>> +     kref_init(&fence->refcount);
>>> +     fence->ops = ops;
>>> +     fence->priv = priv;
>>> +     fence->needs_sw_signal = false;
>>> +     fence->signaled = false;
>>> +     init_waitqueue_head(&fence->event_queue);
>>> +}
>>> +
>>> +void dma_fence_get(struct dma_fence *fence);
>>> +void dma_fence_put(struct dma_fence *fence);
>>> +
>>> +int dma_fence_signal(struct dma_fence *fence);
>>> +int dma_fence_wait(struct dma_fence *fence, bool intr, unsigned long timeout);
>>> +int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb,
>>> +                        dma_fence_func_t func, void *priv);
>>> +
>>> +#endif /* __DMA_FENCE_H__ */
>>>
>>>
>>> _______________________________________________
>>> Linaro-mm-sig mailing list
>>> Linaro-mm-sig@...ts.linaro.org
>>> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig
>> --
>> Daniel Vetter
>> Mail: daniel@...ll.ch
>> Mobile: +41 (0)79 365 57 48
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@...ts.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
~Maarten
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives