Message-ID: <1edd0499-ce5d-45a0-a989-ecb86f726795@igalia.com>
Date: Fri, 23 May 2025 15:58:33 +0100
From: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
To: phasta@...nel.org, Matthew Brost <matthew.brost@...el.com>,
Danilo Krummrich <dakr@...nel.org>,
Christian König <ckoenig.leichtzumerken@...il.com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
Sumit Semwal <sumit.semwal@...aro.org>
Cc: dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
linux-media@...r.kernel.org
Subject: Re: [PATCH] drm/sched/tests: Use one lock for fence context
On 22/05/2025 15:06, Philipp Stanner wrote:
> On Wed, 2025-05-21 at 11:24 +0100, Tvrtko Ursulin wrote:
>>
>> On 21/05/2025 11:04, Philipp Stanner wrote:
>>> When the unit tests were implemented, each scheduler job got its own,
>>> distinct lock. This is not how dma_fence context locking rules are
>>> meant to be implemented. All jobs belonging to the same fence context
>>> (in this case: the scheduler) should share a lock for their
>>> dma_fences. This is to comply with various dma_fence rules, e.g.,
>>> ensuring that only one fence gets signaled at a time.
>>>
>>> Use the fence context (scheduler) lock for the jobs.
>>
>> I think for the mock scheduler it works to share the lock, but I don't
>> see that the commit message is correct. Where do you see the
>> requirement to share the lock? AFAIK fence->lock is a fence lock,
>> nothing more semantically.
>
> This patch is partly meant to probe a bit with Christian and Danilo, to
> see whether we can get more clarity about it.
>
> In many places, notably Nouveau, it is well-established practice to use
> one lock for the fctx and all the jobs associated with it.
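
Something like this, as a minimal sketch of that shared-lock pattern (all
names below are purely illustrative, taken neither from Nouveau nor from
this patch):

#include <linux/atomic.h>
#include <linux/dma-fence.h>
#include <linux/spinlock.h>

struct example_fence_ctx {
	spinlock_t lock;	/* one lock shared by every fence on this context */
	u64 context;		/* from dma_fence_context_alloc(1) */
	atomic_t next_seqno;
};

static void example_fence_ctx_init(struct example_fence_ctx *ctx)
{
	spin_lock_init(&ctx->lock);
	ctx->context = dma_fence_context_alloc(1);
	atomic_set(&ctx->next_seqno, 0);
}

static void example_fence_attach(struct example_fence_ctx *ctx,
				 struct dma_fence *fence,
				 const struct dma_fence_ops *ops)
{
	/* Every fence created on this context shares &ctx->lock. */
	dma_fence_init(fence, ops, &ctx->lock, ctx->context,
		       atomic_inc_return(&ctx->next_seqno));
}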
>
>
>>
>> And what does "ensuring that only one fence gets signalled at a time"
>> mean? You mean signal in seqno order?
>
> Yes, that's related. If jobs' fences can get signaled independently
> from each other, that might race and screw up the ordering. A common
> lock can prevent that.
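
To sketch what that serialisation could look like, building on the
hypothetical example_fence_ctx above (again illustrative only, not from
the patch): walk the pending fences in seqno order under the one shared
lock and signal them there.

struct example_pending_fence {
	struct dma_fence fence;
	struct list_head link;		/* on a list kept in seqno order */
};

/* Signal every pending fence up to and including @seqno, in order. */
static void example_ctx_signal_up_to(struct example_fence_ctx *ctx,
				     struct list_head *pending, u64 seqno)
{
	struct example_pending_fence *p, *tmp;
	unsigned long flags;

	spin_lock_irqsave(&ctx->lock, flags);
	list_for_each_entry_safe(p, tmp, pending, link) {
		if (p->fence.seqno > seqno)
			break;
		list_del_init(&p->link);
		/* ctx->lock is the fence lock here, hence the _locked variant. */
		dma_fence_signal_locked(&p->fence);
	}
	spin_unlock_irqrestore(&ctx->lock, flags);
}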
>
>> Even that is not guaranteed by the contract, due to opportunistic
>> signalling.
>
> Jobs must be pushed to the hardware in the order they were submitted,
> and their fences must therefore be signaled in order. No?
>
> What do you mean by opportunistic signaling?

Our beloved dma_fence_is_signaled(). An external caller can signal a
fence before the driver which owns it does.
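
For reference, dma_fence_is_signaled() does roughly this (a simplified
paraphrase of include/linux/dma-fence.h, not a verbatim copy):

static inline bool dma_fence_is_signaled(struct dma_fence *fence)
{
	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
		return true;

	if (fence->ops->signaled && fence->ops->signaled(fence)) {
		/*
		 * The external caller signals the fence here, possibly
		 * before the owning driver does.
		 */
		dma_fence_signal(fence);
		return true;
	}

	return false;
}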

If you change the commit message to correctly describe this as a
simplification, since there is no need for separate locks, then I am
good with that. It is a good simplification in that case.
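
To spell out why the first hunk switches to dma_fence_signal_locked()
(my reading of the diff, simplified rather than a verbatim copy of
mock_scheduler.c): once the hw_fence uses &sched->lock as its fence
lock, a completion path that already holds that lock must use the
_locked variant.

static void example_job_complete_locked(struct drm_mock_scheduler *sched,
					struct drm_mock_sched_job *job)
{
	lockdep_assert_held(&sched->lock);

	job->flags |= DRM_MOCK_SCHED_JOB_DONE;
	list_move_tail(&job->link, &sched->done_list);
	/* fence->lock is &sched->lock and is already held here. */
	dma_fence_signal_locked(&job->hw_fence);
	complete(&job->done);
}
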
Regards,
Tvrtko
>
>
> P.
>
>
>
>
>>
>> Regards,
>>
>> Tvrtko
>>
>>> Signed-off-by: Philipp Stanner <phasta@...nel.org>
>>> ---
>>> drivers/gpu/drm/scheduler/tests/mock_scheduler.c | 5 ++---
>>> drivers/gpu/drm/scheduler/tests/sched_tests.h | 1 -
>>> 2 files changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>> index f999c8859cf7..17023276f4b0 100644
>>> --- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>> +++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
>>> @@ -64,7 +64,7 @@ static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job)
>>>
>>> job->flags |= DRM_MOCK_SCHED_JOB_DONE;
>>> list_move_tail(&job->link, &sched->done_list);
>>> - dma_fence_signal(&job->hw_fence);
>>> + dma_fence_signal_locked(&job->hw_fence);
>>> complete(&job->done);
>>> }
>>>
>>> @@ -123,7 +123,6 @@ drm_mock_sched_job_new(struct kunit *test,
>>> job->test = test;
>>>
>>> init_completion(&job->done);
>>> - spin_lock_init(&job->lock);
>>> INIT_LIST_HEAD(&job->link);
>>> hrtimer_setup(&job->timer,
>>> drm_mock_sched_job_signal_timer,
>>> CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
>>> @@ -169,7 +168,7 @@ static struct dma_fence *mock_sched_run_job(struct drm_sched_job *sched_job)
>>>
>>> dma_fence_init(&job->hw_fence,
>>> &drm_mock_sched_hw_fence_ops,
>>> - &job->lock,
>>> + &sched->lock,
>>> sched->hw_timeline.context,
>>> atomic_inc_return(&sched->hw_timeline.next_seqno));
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/tests/sched_tests.h b/drivers/gpu/drm/scheduler/tests/sched_tests.h
>>> index 27caf8285fb7..fbba38137f0c 100644
>>> --- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
>>> +++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
>>> @@ -106,7 +106,6 @@ struct drm_mock_sched_job {
>>> unsigned int duration_us;
>>> ktime_t finish_at;
>>>
>>> - spinlock_t lock;
>>> struct dma_fence hw_fence;
>>>
>>> struct kunit *test;
>>
>