Message-ID: <9950dd13-d5c1-4b34-b3f9-2528a1ffb989@igalia.com>
Date: Fri, 7 Nov 2025 10:39:26 +0000
From: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
To: Christian König <christian.koenig@....com>,
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>,
Alex Deucher <alexander.deucher@....com>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>
Cc: amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/3] drm/amdgpu: increment sched score on entity
selection
On 07/11/2025 10:26, Christian König wrote:
> On 11/7/25 10:04, Pierre-Eric Pelloux-Prayer wrote:
>> For hw engines that can't load balance jobs, entities are
>> "statically" load balanced: on their first submit, they select
>> the best scheduler based on its score.
>> The score is made up of 2 parts:
>> * the job queue depth (how many jobs are executing/waiting)
>> * the number of entities assigned
>>
>> The second part is only relevant for static load balancing:
>> it accounts for how many entities are attached to this
>> scheduler, knowing that if they ever submit jobs, those jobs
>> will go to this one.
>>
>> For rings that can load balance jobs freely, idle entities
>> aren't a concern and shouldn't impact the scheduler's decisions.
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 22 +++++++++++++++++-----
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 +
>> 2 files changed, 18 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index afedea02188d..4d91cbcbcf25 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -209,6 +209,7 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> struct amdgpu_ctx_entity *entity;
>> enum drm_sched_priority drm_prio;
>> unsigned int hw_prio, num_scheds;
>> + struct amdgpu_ring *aring;
>> int32_t ctx_prio;
>> int r;
>>
>> @@ -239,11 +240,13 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> goto error_free_entity;
>> }
>>
>> - /* disable load balance if the hw engine retains context among dependent jobs */
>> - if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
>> - hw_ip == AMDGPU_HW_IP_VCN_DEC ||
>> - hw_ip == AMDGPU_HW_IP_UVD_ENC ||
>> - hw_ip == AMDGPU_HW_IP_UVD) {
>> + sched = scheds[0];
>> + aring = container_of(sched, struct amdgpu_ring, sched);
>> +
>> + if (aring->funcs->engine_retains_context) {
>> + /* Disable load balancing between multiple schedulers if the hw
>> + * engine retains context among dependent jobs.
>> + */
>> sched = drm_sched_pick_best(scheds, num_scheds);
>> scheds = &sched;
>> num_scheds = 1;
>> @@ -258,6 +261,12 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> if (cmpxchg(&ctx->entities[hw_ip][ring], NULL, entity))
>> goto cleanup_entity;
>>
>> + if (aring->funcs->engine_retains_context) {
>> + aring = container_of(sched, struct amdgpu_ring, sched);
>> + entity->sched_score = aring->sched_score;
>> + atomic_inc(entity->sched_score);
>> + }
>> +
>> return 0;
>>
>> cleanup_entity:
>> @@ -514,6 +523,9 @@ static void amdgpu_ctx_do_release(struct kref *ref)
>> if (!ctx->entities[i][j])
>> continue;
>>
>> + if (ctx->entities[i][j]->sched_score)
>> + atomic_dec(ctx->entities[i][j]->sched_score);
>> +
>> drm_sched_entity_destroy(&ctx->entities[i][j]->entity);
>> }
>> }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> index 090dfe86f75b..f7b44f96f374 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> @@ -39,6 +39,7 @@ struct amdgpu_ctx_entity {
>> uint32_t hw_ip;
>> uint64_t sequence;
>> struct drm_sched_entity entity;
>> + atomic_t *sched_score;
>
> I would rather not have that additional member here.
>
> In addition to that, we are messing with the internals of the scheduler here and should probably have two clean functions to increase/decrease the score.
Don't forget it is a driver-owned atomic_t, so I think it is fine if the 
driver manipulates it on its own.
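
That said, if we do want wrappers they could just as well live on the 
amdgpu side. Something along these lines (purely a sketch with 
hypothetical names, not part of this patch), keeping the atomic 
handling in one place in amdgpu_ctx.c without touching scheduler 
internals:

	static void amdgpu_ctx_entity_get_score(struct amdgpu_ctx_entity *entity,
						struct amdgpu_ring *ring)
	{
		/* Remember the driver-owned score so release can drop it. */
		entity->sched_score = ring->sched_score;
		atomic_inc(entity->sched_score);
	}

	static void amdgpu_ctx_entity_put_score(struct amdgpu_ctx_entity *entity)
	{
		if (entity->sched_score)
			atomic_dec(entity->sched_score);
	}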
Regards,
Tvrtko
> Regards,
> Christian.
>
>> struct dma_fence *fences[];
>> };
>>
>