Message-ID: <0f4c18b7-4591-4450-8d55-b7bfe789d9fa@damsy.net>
Date: Wed, 3 Sep 2025 10:48:58 +0200
From: Pierre-Eric Pelloux-Prayer <pierre-eric@...sy.net>
To: Tvrtko Ursulin <tursulin@...ulin.net>,
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>,
Alex Deucher <alexander.deucher@....com>,
Christian König <christian.koenig@....com>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>
Cc: amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity
selection
Hi,
On 01/09/2025 at 11:02, Tvrtko Ursulin wrote:
>
> On 22/08/2025 14:43, Pierre-Eric Pelloux-Prayer wrote:
>> For hw engines that can't load balance jobs, entities are
>> "statically" load balanced: on their first submit, they select
>> the best scheduler based on its score.
>> The score is made up of 2 parts:
>> * the job queue depth (how many jobs are executing/waiting)
>> * the number of entities assigned
>>
>> The second part is only relevant for static load balancing:
>> it's a way to consider how many entities are attached to this
>> scheduler, knowing that if they ever submit jobs they will go
>> to this one.
>>
>> For rings that can load balance jobs freely, idle entities
>> aren't a concern and shouldn't impact the scheduler's decisions.
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 23 ++++++++++++++++++-----
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 +
>> 2 files changed, 19 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index f5d5c45ddc0d..4a078d2d98c5 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -206,9 +206,11 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> {
>> struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
>> struct amdgpu_device *adev = ctx->mgr->adev;
>> + bool static_load_balancing = false;
>> struct amdgpu_ctx_entity *entity;
>> enum drm_sched_priority drm_prio;
>> unsigned int hw_prio, num_scheds;
>> + struct amdgpu_ring *aring;
>> int32_t ctx_prio;
>> int r;
>> @@ -236,17 +238,22 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> r = amdgpu_xcp_select_scheds(adev, hw_ip, hw_prio, fpriv,
>> &num_scheds, &scheds);
>> if (r)
>> - goto cleanup_entity;
>> + goto error_free_entity;
>
> Indeed, previously this was calling drm_sched_entity_fini() before
> drm_sched_entity_init() and it only worked because of kzalloc.
>
>> }
>> /* disable load balance if the hw engine retains context among dependent jobs */
>> - if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
>> - hw_ip == AMDGPU_HW_IP_VCN_DEC ||
>> - hw_ip == AMDGPU_HW_IP_UVD_ENC ||
>> - hw_ip == AMDGPU_HW_IP_UVD) {
>> + static_load_balancing = hw_ip == AMDGPU_HW_IP_VCN_ENC ||
>> + hw_ip == AMDGPU_HW_IP_VCN_DEC ||
>> + hw_ip == AMDGPU_HW_IP_UVD_ENC ||
>> + hw_ip == AMDGPU_HW_IP_UVD;
>> +
>> + if (static_load_balancing) {
>> sched = drm_sched_pick_best(scheds, num_scheds);
>> scheds = &sched;
>> num_scheds = 1;
>> + aring = container_of(sched, struct amdgpu_ring, sched);
>> + entity->sched_ring_score = aring->sched_score;
>> + atomic_inc(entity->sched_ring_score);
>
> If we were to bike-shed, we could find a way to avoid the new local variables.
> Keep the if as is and assign to entity->sched_ring_score directly, then check
> for that on the cleanup path. This still works thanks to kzalloc. Or, if relying on
> kzalloc is not desired, the bool static_load_balancing could at least be replaced by
> renaming the aring local to static_aring and using it like the name suggests.
>
> Could also move the atomic_inc to the success path to avoid having to add code
> to the error unwind.
>
Christian, Alex: any preferences on the code style to use here?
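For reference, the first variant would look roughly like this (untested sketch,
relying on entity being kzalloc'ed so that sched_ring_score stays NULL for the
engines that load balance freely):

	if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
	    hw_ip == AMDGPU_HW_IP_VCN_DEC ||
	    hw_ip == AMDGPU_HW_IP_UVD_ENC ||
	    hw_ip == AMDGPU_HW_IP_UVD) {
		sched = drm_sched_pick_best(scheds, num_scheds);
		scheds = &sched;
		num_scheds = 1;
		/* remember the shared score of the ring we picked */
		entity->sched_ring_score =
			container_of(sched, struct amdgpu_ring, sched)->sched_score;
		atomic_inc(entity->sched_ring_score);
	}
	...
error_free_entity:
	/* non-NULL only when the static path above was taken */
	if (entity->sched_ring_score)
		atomic_dec(entity->sched_ring_score);
	kfree(entity);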
> Both cases are, I think, equally racy in the sense that parallel
> amdgpu_ctx_init_entity invocations can all pick the same sched. But that is true
> today AFAICT, because the score is not incremented until later in the job submit
> process.
>
> I suppose one way to make the assignment more robust would be to "rotate" (or
> randomize) the sched list atomically before calling drm_sched_pick_best. Thoughts?
>
IMO, without the ability to do job-level load balancing, there will always be
cases where the scheduler choice is not optimal. For this reason I preferred to
keep the code as simple as possible.
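For completeness, my understanding of the rotation idea is something like the
sketch below (amdgpu_ctx_pick_best_rotated() is a hypothetical helper, not part
of this patch):

static struct drm_gpu_scheduler *
amdgpu_ctx_pick_best_rotated(struct drm_gpu_scheduler **scheds,
			     unsigned int num_scheds)
{
	struct drm_gpu_scheduler *rotated[8]; /* illustrative bound only */
	static atomic_t rotation;
	unsigned int start, i;

	if (num_scheds <= 1 || num_scheds > ARRAY_SIZE(rotated))
		return drm_sched_pick_best(scheds, num_scheds);

	/* Start the scan at a different index on each call so that
	 * concurrent first submits seeing identical scores are less
	 * likely to settle on the same ring. */
	start = (unsigned int)atomic_inc_return(&rotation) % num_scheds;
	for (i = 0; i < num_scheds; i++)
		rotated[i] = scheds[(start + i) % num_scheds];

	return drm_sched_pick_best(rotated, num_scheds);
}

Even with that, two entities racing between the pick and their first job
submission can still land on the same ring, so I'm not convinced the extra
complexity buys us much here.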
Thanks,
Pierre-Eric
> Regards,
>
> Tvrtko
>
>> }
>> r = drm_sched_entity_init(&entity->entity, drm_prio, scheds, num_scheds,
>> @@ -264,6 +271,9 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> drm_sched_entity_fini(&entity->entity);
>> error_free_entity:
>> + if (static_load_balancing)
>> + atomic_dec(entity->sched_ring_score);
>> +
>> kfree(entity);
>> return r;
>> @@ -514,6 +524,9 @@ static void amdgpu_ctx_do_release(struct kref *ref)
>> if (!ctx->entities[i][j])
>> continue;
>> + if (ctx->entities[i][j]->sched_ring_score)
>> + atomic_dec(ctx->entities[i][j]->sched_ring_score);
>> +
>> drm_sched_entity_destroy(&ctx->entities[i][j]->entity);
>> }
>> }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> index 090dfe86f75b..076a0e165ce0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> @@ -39,6 +39,7 @@ struct amdgpu_ctx_entity {
>> uint32_t hw_ip;
>> uint64_t sequence;
>> struct drm_sched_entity entity;
>> + atomic_t *sched_ring_score;
>> struct dma_fence *fences[];
>> };