Message-ID: <506C30B6.5020403@vmware.com>
Date:	Wed, 03 Oct 2012 14:33:58 +0200
From:	Thomas Hellstrom <thellstrom@...are.com>
To:	Maarten Lankhorst <maarten.lankhorst@...onical.com>
CC:	jakob@...are.com, dri-devel@...ts.freedesktop.org,
	linaro-mm-sig@...ts.linaro.org, sumit.semwal@...aro.org,
	linux-media@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/5] reservation: cross-device reservation support

I took a quick look at the fencing and added some thoughts on
shared fences:

On 09/28/2012 02:43 PM, Maarten Lankhorst wrote:
> This adds support for a generic reservations framework that can be
> hooked up to ttm and dma-buf and allows easy sharing of reservations
> across devices.
>
> The idea is that a dma-buf and ttm object both will get a pointer
> to a struct reservation_object, which has to be reserved before
> anything is done with the buffer.
>
> Some followup patches are needed in ttm so the lru_lock is no longer
> taken during the reservation step. This makes the lockdep annotation
> patch a lot more useful, and the assumption that the lru lock protects
> atomic removal off the lru list will fail soon, anyway.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@...onical.com>
>
> +
> +/**
> + * ticket_commit - commit a reservation with a new fence
> + * @ticket:	[in]	the reservation_ticket returned by
> + * ticket_reserve
> + * @entries:	[in]	a linked list of struct reservation_entry
> + * @fence:	[in]	the fence that indicates completion
> + *
> + * This function will call reservation_ticket_fini, no need
> + * to do it manually.
> + *
> + * This function should be called after a hardware command submission is
> + * completed successfully. The fence is used to indicate completion of
> + * those commands.
> + */
> +void
> +ticket_commit(struct reservation_ticket *ticket,
> +		  struct list_head *entries, struct fence *fence)
> +{
> +	struct list_head *cur;
> +
> +	if (list_empty(entries))
> +		return;
> +
> +	if (WARN_ON(!fence)) {
> +		ticket_backoff(ticket, entries);
> +		return;
> +	}
> +
> +	list_for_each(cur, entries) {
> +		struct reservation_object *bo;
> +		bool shared;
> +
> +		reservation_entry_get(cur, &bo, &shared);
> +
> +		if (!shared) {
> +			int i;
> +			for (i = 0; i < bo->fence_shared_count; ++i) {
> +				fence_put(bo->fence_shared[i]);
> +				bo->fence_shared[i] = NULL;
> +			}
> +			bo->fence_shared_count = 0;
> +			if (bo->fence_excl)
> +				fence_put(bo->fence_excl);
> +
> +			bo->fence_excl = fence;
I assume here that the validation code has made sure that fences are
either ordered or expired so that "fence" signals *after* all other fences
have signaled.
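To make that assumption checkable, something like the sketch below could
be run before the old fences are dropped. fence_is_signaled() and
fence_signals_before() are placeholders for whatever queries the fence
API ends up providing, not existing calls:

/*
 * Sketch only: verify the validation code's ordering guarantee before
 * installing "excl" as the exclusive fence.  Every previously attached
 * fence must either have signaled already or be known to signal before
 * the new exclusive fence does.
 */
static void check_excl_ordering(struct reservation_object *bo,
				struct fence *excl)
{
	int i;

	for (i = 0; i < bo->fence_shared_count; ++i)
		WARN_ON(!fence_is_signaled(bo->fence_shared[i]) &&
			!fence_signals_before(bo->fence_shared[i], excl));
}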

> +		} else {
> +			if (WARN_ON(bo->fence_shared_count >=
> +				    ARRAY_SIZE(bo->fence_shared))) {
> +				continue;
> +			}
This is bad. Failure to fence a buffer is a catastrophic error that can lead
to pages being reused for other things while they are still being read by the
GPU. The caller must be informed with an error code so that it can sync on
the fence.
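As a minimal fix (a sketch only, assuming ticket_commit is changed to
return an int and that unwinding of the already committed entries is
handled elsewhere), the overflow branch could propagate the failure
instead of just warning:

		if (bo->fence_shared_count >=
		    ARRAY_SIZE(bo->fence_shared)) {
			/*
			 * Sketch: tell the caller, which must then wait
			 * on "fence" before the buffer may be reused,
			 * since the reservation object has no record of
			 * it.
			 */
			return -EBUSY;
		}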

I guess this has been discussed previously, but I think it might be more
appropriate to use a list of pointers to fences. There is an allocation
overhead for this, but allocation from a mem cache should really be fast
enough, and the list entries can be allocated during ticket_reserve to
avoid errors in the commit code.
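Roughly the kind of thing I mean, as a sketch only; shared_fence_node,
shared_fence_cache and the fence_shared_list field are made-up names for
illustration, with struct fence and struct reservation_object taken from
these patches:

#include <linux/list.h>
#include <linux/slab.h>

/* One node per shared fence, allocated from a slab cache. */
struct shared_fence_node {
	struct list_head head;
	struct fence *fence;
};

/* Created once with kmem_cache_create() at init time. */
static struct kmem_cache *shared_fence_cache;

/* Called from ticket_reserve(), where returning -ENOMEM is still OK. */
static struct shared_fence_node *shared_fence_prealloc(void)
{
	return kmem_cache_alloc(shared_fence_cache, GFP_KERNEL);
}

/* Called from ticket_commit(); cannot fail. */
static void shared_fence_attach(struct reservation_object *bo,
				struct shared_fence_node *node,
				struct fence *fence)
{
	node->fence = fence;
	fence_get(fence);
	list_add_tail(&node->head, &bo->fence_shared_list);
}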

> +
> +			bo->fence_shared[bo->fence_shared_count++] = fence;

It might be good if this function had access to a lightweight cross-device
helper,

    struct fence *order_fences(struct fence *a, struct fence *b)

that can quickly check two fences and determine whether signaling one implies
that the other one has also signaled. In that case one or more of the shared
fences can be unreferenced, putting less pressure on the fence_shared array.
The lightweight version of order_fences is allowed to fail if there is no
simple and quick way of ordering the fences. It could perhaps be added to the
fence API.
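A sketch of what such a helper might look like, assuming the
context/seqno pair from the fence patch; returning an error pointer is
just one way of saying "no quick answer":

/*
 * Sketch only: return the fence whose signaling implies the other has
 * signaled too (so the other can be dropped), or an error pointer if
 * there is no simple and quick way to order them.
 */
struct fence *order_fences(struct fence *a, struct fence *b)
{
	/* Same context: a wrap-safe seqno comparison is enough. */
	if (a->context == b->context)
		return (int)(a->seqno - b->seqno) >= 0 ? a : b;

	return ERR_PTR(-EINVAL);	/* cannot tell cheaply */
}

ticket_commit() could then run the new fence against each already
attached shared fence and keep only the one that signals last per
context.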

And (even though it's not part of the reservation API) there is a heavyweight
version of that cross-device function,

    int order_fence(struct fence *a, int gpu_engine)

needed for the exclusive fencing in the validation code, which *makes sure*
fence a has signaled before the current gpu_engine executes its commands. For
some gpu/fence pairs the ordering is done implicitly since they share the
same command stream, for some it's possible to insert barriers into the
gpu_engine command stream (radeon and nouveau are doing that), and if there
is no other way of doing it, the code will need to wait for the fence.
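Roughly (all of the helpers called below are placeholders for
engine-specific hooks, not existing API):

/*
 * Sketch only: make sure fence "a" has signaled before gpu_engine
 * executes its commands, cheapest method first.
 */
int order_fence(struct fence *a, int gpu_engine)
{
	/* 1) Implicit: already signaled, or same command stream. */
	if (fence_is_signaled(a) || fence_on_engine(a, gpu_engine))
		return 0;

	/*
	 * 2) Barrier/semaphore in the engine's command stream
	 *    (what radeon and nouveau do between rings).
	 */
	if (engine_emit_wait(gpu_engine, a) == 0)
		return 0;

	/* 3) Last resort: block on the CPU until the fence signals. */
	return fence_wait(a, false);
}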

> +		}
> +		fence_get(fence);

Hmm. Perhaps a fence_get(fence, NUM) to avoid a huge number of atomic incs?
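Something like this, assuming struct fence keeps using a kref for its
refcount as in the fence patch (fence_get_many is a made-up name):

/*
 * Sketch only: take "count" references with a single atomic op instead
 * of looping over fence_get().
 */
static inline void fence_get_many(struct fence *fence, unsigned count)
{
	atomic_add(count, &fence->refcount.refcount);
}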

> +
> +		object_unreserve(bo, ticket);
> +	}
> +	reservation_ticket_fini(ticket);
> +}
> +EXPORT_SYMBOL(ticket_commit);
>

Thomas

