Message-ID: <53CF699D.9070902@canonical.com>
Date: Wed, 23 Jul 2014 09:51:57 +0200
From: Maarten Lankhorst <maarten.lankhorst@...onical.com>
To: Christian König <christian.koenig@....com>,
Daniel Vetter <daniel.vetter@...ll.ch>,
Christian König <deathsimple@...afone.de>
CC: Thomas Hellstrom <thellstrom@...are.com>,
nouveau <nouveau@...ts.freedesktop.org>,
LKML <linux-kernel@...r.kernel.org>,
dri-devel <dri-devel@...ts.freedesktop.org>,
Ben Skeggs <bskeggs@...hat.com>,
"Deucher, Alexander" <alexander.deucher@....com>
Subject: Re: [Nouveau] [PATCH 09/17] drm/radeon: use common fence implementation for fences
On 23-07-14 09:37, Christian König wrote:
> On 23.07.2014 09:31, Daniel Vetter wrote:
>> On Wed, Jul 23, 2014 at 9:26 AM, Christian König
>> <deathsimple@...afone.de> wrote:
>>> It's not a locking problem I'm talking about here. Radeon's lockup handling
>>> kicks in when anything calls into the driver from the outside. If you have a
>>> fence wait function that's called from the outside but doesn't handle
>>> lockups, you essentially rely on somebody else calling another radeon
>>> function for the lockup to be resolved.
>> So you don't have a timer in radeon that periodically checks whether
>> progress is still being made? That's the approach we're using in i915,
>> together with some tricks to kick any stuck waiters so that we can
>> reliably step in and grab locks for the reset.
>
> We tried this approach, but it didn't work at all.
>
> I already considered trying it again because of the upcoming fence implementation, but on reflection, a driver being forced to change its handling because of the fence implementation is just another hint that there is something wrong here.
As far as I can tell, it wouldn't need to be reworked for the fence implementation right now, only from the moment you want to allow callers outside of radeon. :-)
Doing a GPU lockup recovery in the wait function would be messy even right now: you would hit a deadlock in ttm_bo_delayed_delete -> ttm_bo_cleanup_refs_and_unlock.
Regardless of the fence implementation, why would it be a good idea to do a full lockup recovery when some other driver is
calling your wait function? That doesn't seem like a nice thing to do, so I think a timeout is the best error you could return here;
other drivers have to deal with that anyway.
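To make that concrete, here is a rough userspace sketch of a wait function that only reports a timeout and never starts a recovery. Everything in it is hypothetical: the sketch_fence struct, both function names, and the polling loop are made up for illustration and are neither radeon code nor the common fence API. The return convention (0 on timeout, remaining ticks otherwise) is likewise an assumption chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a driver fence; not the kernel's fence struct. */
struct sketch_fence {
    uint32_t seqno;            /* sequence number this fence waits for */
    const uint32_t *hw_seqno;  /* last sequence number the GPU completed */
};

/* Wrap-safe check of whether the GPU has reached this fence. */
bool sketch_fence_signaled(const struct sketch_fence *f)
{
    return (int32_t)(*f->hw_seqno - f->seqno) >= 0;
}

/*
 * Wait for at most `timeout` polling ticks (assumed >= 0).
 * Returns the remaining ticks (at least 1) once the fence has signaled,
 * or 0 on timeout. It deliberately does NOT attempt any lockup recovery;
 * the caller only ever sees "signaled" or "timed out".
 */
long sketch_fence_wait_timeout(const struct sketch_fence *f, long timeout)
{
    long remaining = timeout;

    for (;;) {
        if (sketch_fence_signaled(f))
            return remaining > 0 ? remaining : 1;
        if (remaining <= 0)
            return 0;      /* timed out; recovery is left to the owning driver */
        remaining--;       /* a real driver would sleep here, not spin */
    }
}
```

A caller from another driver would then treat a return value of 0 like any other timed-out wait, and the lockup itself stays the owning driver's problem.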
~Maarten