Message-ID: <e9b84f62-f4e4-4c11-a1df-d429f0135cab@damsy.net>
Date: Thu, 13 Nov 2025 17:43:28 +0100
From: Pierre-Eric Pelloux-Prayer <pierre-eric@...sy.net>
To: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>,
Christian König <christian.koenig@....com>,
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>,
Alex Deucher <alexander.deucher@....com>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>
Cc: amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/3] drm/amdgpu: increment sched score on entity selection
On 07/11/2025 11:39, Tvrtko Ursulin wrote:
>
> On 07/11/2025 10:26, Christian König wrote:
>> On 11/7/25 10:04, Pierre-Eric Pelloux-Prayer wrote:
>>> For hw engines that can't load balance jobs, entities are
>>> "statically" load balanced: on their first submit, they select
>>> the best scheduler based on its score.
>>> The score is made up of 2 parts:
>>> * the job queue depth (how many jobs are executing/waiting)
>>> * the number of entities assigned
>>>
>>> The second part is only relevant for the static load balance:
>>> it's a way to consider how many entities are attached to this
>>> scheduler, knowing that if they ever submit jobs they will go
>>> to this one.
>>>
>>> For rings that can load balance jobs freely, idle entities
>>> aren't a concern and shouldn't impact the scheduler's decisions.
>>>
>>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>
>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 22 +++++++++++++++++-----
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 +
>>> 2 files changed, 18 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> index afedea02188d..4d91cbcbcf25 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> @@ -209,6 +209,7 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>>> struct amdgpu_ctx_entity *entity;
>>> enum drm_sched_priority drm_prio;
>>> unsigned int hw_prio, num_scheds;
>>> + struct amdgpu_ring *aring;
>>> int32_t ctx_prio;
>>> int r;
>>> @@ -239,11 +240,13 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>>> goto error_free_entity;
>>> }
>>> - /* disable load balance if the hw engine retains context among dependent jobs */
>>> - if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
>>> - hw_ip == AMDGPU_HW_IP_VCN_DEC ||
>>> - hw_ip == AMDGPU_HW_IP_UVD_ENC ||
>>> - hw_ip == AMDGPU_HW_IP_UVD) {
>>> + sched = scheds[0];
>>> + aring = container_of(sched, struct amdgpu_ring, sched);
>>> +
>>> + if (aring->funcs->engine_retains_context) {
>>> + /* Disable load balancing between multiple schedulers if the hw
>>> + * engine retains context among dependent jobs.
>>> + */
>>> sched = drm_sched_pick_best(scheds, num_scheds);
>>> scheds = &sched;
>>> num_scheds = 1;
>>> @@ -258,6 +261,12 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>>> if (cmpxchg(&ctx->entities[hw_ip][ring], NULL, entity))
>>> goto cleanup_entity;
>>> + if (aring->funcs->engine_retains_context) {
>>> + aring = container_of(sched, struct amdgpu_ring, sched);
>>> + entity->sched_score = aring->sched_score;
>>> + atomic_inc(entity->sched_score);
>>> + }
>>> +
>>> return 0;
>>> cleanup_entity:
>>> @@ -514,6 +523,9 @@ static void amdgpu_ctx_do_release(struct kref *ref)
>>> if (!ctx->entities[i][j])
>>> continue;
>>> + if (ctx->entities[i][j]->sched_score)
>>> + atomic_dec(ctx->entities[i][j]->sched_score);
>>> +
>>> drm_sched_entity_destroy(&ctx->entities[i][j]->entity);
>>> }
>>> }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>>> index 090dfe86f75b..f7b44f96f374 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>>> @@ -39,6 +39,7 @@ struct amdgpu_ctx_entity {
>>> uint32_t hw_ip;
>>> uint64_t sequence;
>>> struct drm_sched_entity entity;
>>> + atomic_t *sched_score;
>>
>> I would rather prefer to not have that additional member here.
>>
>> Additional to that we are messing with the internals of the scheduler here and
>> should probably have two clean functions to increase/decrease the score.
The problem with exposing a function to cleanly inc/dec the score is that it
requires entity->rq to never be set to NULL.
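
For illustration, a minimal sketch of what such a helper could look like
(drm_sched_entity_inc_score() is a hypothetical name, not an existing
scheduler function):

	/* Hypothetical helper: bump the score of the scheduler this
	 * entity is currently assigned to. It dereferences entity->rq,
	 * so it is only safe to call while entity->rq is guaranteed to
	 * be non-NULL.
	 */
	static inline void drm_sched_entity_inc_score(struct drm_sched_entity *entity)
	{
		atomic_inc(entity->rq->sched->score);
	}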
So I'd rather keep the current code / workaround in amdgpu. Or I could modify
the next one to do something like this:

	if (entity->inc_score_on_add)
		atomic_inc(rq->sched->score);

i.e. keeping the existing logic in the scheduler, but making it opt-in instead
of the default behavior.
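
A rough sketch of that opt-in variant, assuming the increment stays where
the scheduler applies it today, when an entity is added to a runqueue
(inc_score_on_add is the proposed field, not an existing one, and the
function body is simplified):

	void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
				     struct drm_sched_entity *entity)
	{
		/* Already on a runqueue, nothing to do. */
		if (!list_empty(&entity->list))
			return;

		spin_lock(&rq->lock);
		/* Only entities that opted in contribute to the score. */
		if (entity->inc_score_on_add)
			atomic_inc(rq->sched->score);
		list_add_tail(&entity->list, &rq->entities);
		spin_unlock(&rq->lock);
	}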
Pierre-Eric
>
> Don't forget it is a driver owned atomic_t so I think it is fine driver
> manipulates it on its own.
>
> Regards,
>
> Tvrtko
>
>
>> Regards,
>> Christian.
>>
>>> struct dma_fence *fences[];
>>> };
>>