Message-ID: <4d6b5ab3-da80-8572-4b2e-52c83a9e4f27@amd.com>
Date: Thu, 26 Sep 2019 14:31:20 +0000
From: "Grodzovsky, Andrey" <Andrey.Grodzovsky@....com>
To: Steven Price <steven.price@....com>,
Daniel Vetter <daniel@...ll.ch>,
David Airlie <airlied@...ux.ie>,
"Koenig, Christian" <Christian.Koenig@....com>
CC: "Deucher, Alexander" <Alexander.Deucher@....com>,
Nayan Deshmukh <nayan26deshmukh@...il.com>,
Sharat Masetty <smasetty@...eaurora.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>
Subject: Re: [PATCH] drm: Don't free jobs in wait_event_interruptible()
On 9/26/19 5:41 AM, Steven Price wrote:
> On 25/09/2019 21:09, Grodzovsky, Andrey wrote:
>> On 9/25/19 12:07 PM, Andrey Grodzovsky wrote:
>>> On 9/25/19 12:00 PM, Steven Price wrote:
>>>
>>>> On 25/09/2019 16:56, Grodzovsky, Andrey wrote:
>>>>> On 9/25/19 11:14 AM, Steven Price wrote:
>>>>>
>>>>>> drm_sched_cleanup_jobs() attempts to free finished jobs, however
>>>>>> because it is called as the condition of wait_event_interruptible()
>>>>>> it must not sleep. Unfortunately some free callbacks (notably for
>>>>>> Panfrost) do sleep.
>>>>>>
>>>>>> Instead let's rename drm_sched_cleanup_jobs() to
>>>>>> drm_sched_get_cleanup_job() and simply return a job for processing if
>>>>>> there is one. The caller can then call the free_job() callback outside
>>>>>> the wait_event_interruptible() where sleeping is possible before
>>>>>> re-checking and returning to sleep if necessary.
>>>>>>
>>>>>> Signed-off-by: Steven Price <steven.price@....com>
>>>>>> ---
>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 44 ++++++++++++++------------
>>>>>> 1 file changed, 24 insertions(+), 20 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index 9a0ee74d82dc..0ed4aaa4e6d1 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> @@ -622,43 +622,41 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>>>>>> }
>>>>>> /**
>>>>>> - * drm_sched_cleanup_jobs - destroy finished jobs
>>>>>> + * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
>>>>>> *
>>>>>> * @sched: scheduler instance
>>>>>> *
>>>>>> - * Remove all finished jobs from the mirror list and destroy them.
>>>>>> + * Returns the next finished job from the mirror list (if there is one)
>>>>>> + * ready for it to be destroyed.
>>>>>> */
>>>>>> -static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched)
>>>>>> +static struct drm_sched_job *
>>>>>> +drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>>>>>> {
>>>>>> + struct drm_sched_job *job = NULL;
>>>>>> unsigned long flags;
>>>>>> /* Don't destroy jobs while the timeout worker is running */
>>>>>> if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>>>>> !cancel_delayed_work(&sched->work_tdr))
>>>>>> - return;
>>>>>> -
>>>>>> -
>>>>>> - while (!list_empty(&sched->ring_mirror_list)) {
>>>>>> - struct drm_sched_job *job;
>>>>>> + return NULL;
>>>>>> - job = list_first_entry(&sched->ring_mirror_list,
>>>>>> + job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>>> struct drm_sched_job, node);
>>>>>> - if (!dma_fence_is_signaled(&job->s_fence->finished))
>>>>>> - break;
>>>>>> - spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>> + spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>> +
>>>>>> + if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>>>> /* remove job from ring_mirror_list */
>>>>>> list_del_init(&job->node);
>>>>>> - spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>>> -
>>>>>> - sched->ops->free_job(job);
>>>>>> + } else {
>>>>>> + job = NULL;
>>>>>> + /* queue timeout for next job */
>>>>>> + drm_sched_start_timeout(sched);
>>>>>> }
>>>>>> - /* queue timeout for next job */
>>>>>> - spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>> - drm_sched_start_timeout(sched);
>>>>>> spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>>> + return job;
>>>>>> }
>>>>>> /**
>>>>>> @@ -698,12 +696,18 @@ static int drm_sched_main(void *param)
>>>>>> struct drm_sched_fence *s_fence;
>>>>>> struct drm_sched_job *sched_job;
>>>>>> struct dma_fence *fence;
>>>>>> + struct drm_sched_job *cleanup_job = NULL;
>>>>>> wait_event_interruptible(sched->wake_up_worker,
>>>>>> - (drm_sched_cleanup_jobs(sched),
>>>>>> + (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
>>>>>> (!drm_sched_blocked(sched) &&
>>>>>> (entity = drm_sched_select_entity(sched))) ||
>>>>>> - kthread_should_stop()));
>>>>>> + kthread_should_stop());
>>>>> Can't we just call drm_sched_cleanup_jobs() right here, remove all the
>>>>> conditions in wait_event_interruptible() (make it always true), and after
>>>>> drm_sched_cleanup_jobs() is called test for all those conditions and
>>>>> return to sleep if they evaluate to false? drm_sched_cleanup_jobs() is
>>>>> called unconditionally inside wait_event_interruptible() anyway... This
>>>>> is more of a question to Christian.
>>>> Christian may know better than me, but I think those conditions need to
>>>> be in wait_event_interruptible() to avoid race conditions. If we simply
>>>> replace all the conditions with a literal "true" then
>>>> wait_event_interruptible() will never actually sleep.
>>>>
>>>> Steve
>>> Yes, you're right, it won't work, as I missed that the condition is
>>> evaluated as the first step in wait_event_interruptible() before it sleeps.
>>>
>>> Andrey
>> Another idea - what about still just relocating drm_sched_cleanup_jobs()
>> to after wait_event_interruptible(), and also calling it in
>> drm_sched_fini(), so that in the case where it is not called from
>> drm_sched_main() because the conditions never evaluate to true, it is
>> eventually called one last time from drm_sched_fini()? I mean - the
>> refactor looks OK to me, but the code becomes somewhat confusing to
>> grasp this way.
>>
>> Andrey
> That sounds similar to my first stab at this[1]. However Christian
> pointed out that it is necessary to also free jobs even if there isn't a
> new one to be scheduled. Otherwise the finished jobs end up lying around
> until something else kicks the scheduler.
But if there is no new job to be scheduled then no one will wake up
sched->wake_up_worker, and the condition in wait_event_interruptible() is
re-evaluated only when the wait-queue head is woken up.
>
> There is also the aspect of queueing the timeout for the next job - this
> is the part that I don't actually understand, but removing it from the
> wait_event_interruptible() invariably seems to cause problems. Hence
> this approach which avoids changing this behaviour. But I welcome input
> from anyone who understands this timeout mechanism!
I'm not familiar with this problem; before it was done from
wait_event_interruptible() it was done from scheduled work fired by the
job's completion interrupt, and I don't remember any issues with it.

In either case I think both solutions are legit, so Reviewed-by: Andrey
Grodzovsky <andrey.grodzovsky@....com>
Andrey
>
> Steve
>
> [1]
> https://lists.freedesktop.org/archives/dri-devel/2019-September/235346.html