[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <474a335b-c79c-49fc-80b0-39471b8c8286@amd.com>
Date: Fri, 20 Dec 2024 16:25:44 +0100
From: Christian König <christian.koenig@....com>
To: Danilo Krummrich <dakr@...nel.org>, Philipp Stanner <pstanner@...hat.com>
Cc: Philipp Stanner <phasta@...nel.org>, Luben Tuikov <ltuikov89@...il.com>,
Matthew Brost <matthew.brost@...el.com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
Sumit Semwal <sumit.semwal@...aro.org>, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, linux-media@...r.kernel.org,
linaro-mm-sig@...ts.linaro.org, Tvrtko Ursulin <tursulin@...ulin.net>,
Andrey Grodzovsky <andrey.grodzovsky@....com>
Subject: Re: [PATCH] drm/sched: Document run_job() refcount hazard
Am 20.12.24 um 15:51 schrieb Danilo Krummrich:
> On Fri, Dec 20, 2024 at 03:11:34PM +0100, Philipp Stanner wrote:
>> On Fri, 2024-12-20 at 14:25 +0100, Christian König wrote:
>>> Am 20.12.24 um 14:18 schrieb Danilo Krummrich:
>>>> On Fri, Dec 20, 2024 at 01:53:34PM +0100, Christian König wrote:
>>>>> Am 20.12.24 um 13:45 schrieb Philipp Stanner:
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index 7ce25281c74c..d6f8df39d848 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> + *
>>>>>> + * @sched_job: the job to run
>>>>>> + *
>>>>>> + * Returns: dma_fence the driver must signal once the
>>>>>> hardware has
>>>>>> + * completed the job ("hardware fence").
>>>>>> + *
>>>>>> + * Note that the scheduler expects to 'inherit' its
>>>>>> own reference to
>>>>>> + * this fence from the callback. It does not invoke an
>>>>>> extra
>>>>>> + * dma_fence_get() on it. Consequently, this callback
>>>>>> must return a
>>>>>> + * fence whose refcount is at least 2: One for the
>>>>>> scheduler's
>>>>>> + * reference returned here, another one for the
>>>>>> reference kept by the
>>>>>> + * driver.
>>>>> Well the driver actually doesn't need any extra reference. The
>>>>> scheduler
>>>>> just needs to guarantee that this reference isn't dropped before
>>>>> it is
>>>>> signaled.
>>>> I think he means the reference the driver's fence context has to
>>>> have in order
>>>> to signal that thing eventually.
>>> Yeah, but this is usually a weak reference. IIRC most drivers don't
>>> increment the reference count for the reference they keep to signal a
>>> fence.
>>>
>>> It's expected that the consumers of the dma_fence keep the fence
>>> alive
>>> at least until it is signaled.
>> So are you saying that the driver having an extra reference (without
>> having obtained it with dma_fence_get()) is not an issue because the
>> driver is the one who will signal the fence [and then be done with it]?
> It's never a "real" issue if you have multiple pointers to a reference counted
> object as long as you can ensure that you hold at least one reference for the
> time you have pointers to the object.
Well, I'm not saying that this isn't an issue. I'm just pointing out
that this is the current practice :)
> But, that's bad design. For every pointer to an object a separate reference
> should be taken.
Yeah, completely agree. Weak references are usually a bad idea if you
don't absolutely need them for something.
Regards,
Christian.
Powered by blists - more mailing lists