Message-ID: <0f4c18b7-4591-4450-8d55-b7bfe789d9fa@damsy.net>
Date: Wed, 3 Sep 2025 10:48:58 +0200
From: Pierre-Eric Pelloux-Prayer <pierre-eric@...sy.net>
To: Tvrtko Ursulin <tursulin@...ulin.net>,
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>,
Alex Deucher <alexander.deucher@....com>,
Christian König <christian.koenig@....com>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>
Cc: amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity
selection
Hi,
On 01/09/2025 at 11:02, Tvrtko Ursulin wrote:
>
> On 22/08/2025 14:43, Pierre-Eric Pelloux-Prayer wrote:
>> For hw engines that can't load balance jobs, entities are
>> "statically" load balanced: on their first submit, they select
>> the best scheduler based on its score.
>> The score is made up of 2 parts:
>> * the job queue depth (how many jobs are executing/waiting)
>> * the number of entities assigned
>>
>> The second part is only relevant for static load balancing:
>> it's a way to consider how many entities are attached to this
>> scheduler, knowing that if they ever submit jobs they will go
>> to this one.
>>
>> For rings that can load balance jobs freely, idle entities
>> aren't a concern and shouldn't impact the scheduler's decisions.
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 23 ++++++++++++++++++-----
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 +
>> 2 files changed, 19 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index f5d5c45ddc0d..4a078d2d98c5 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -206,9 +206,11 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> {
>> struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
>> struct amdgpu_device *adev = ctx->mgr->adev;
>> + bool static_load_balancing = false;
>> struct amdgpu_ctx_entity *entity;
>> enum drm_sched_priority drm_prio;
>> unsigned int hw_prio, num_scheds;
>> + struct amdgpu_ring *aring;
>> int32_t ctx_prio;
>> int r;
>> @@ -236,17 +238,22 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> r = amdgpu_xcp_select_scheds(adev, hw_ip, hw_prio, fpriv,
>> &num_scheds, &scheds);
>> if (r)
>> - goto cleanup_entity;
>> + goto error_free_entity;
>
> Indeed, previously this was calling drm_sched_entity_fini() before
> drm_sched_entity_init() and it only worked because of kzalloc.
>
>> }
>> /* disable load balance if the hw engine retains context among dependent jobs */
>> - if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
>> - hw_ip == AMDGPU_HW_IP_VCN_DEC ||
>> - hw_ip == AMDGPU_HW_IP_UVD_ENC ||
>> - hw_ip == AMDGPU_HW_IP_UVD) {
>> + static_load_balancing = hw_ip == AMDGPU_HW_IP_VCN_ENC ||
>> + hw_ip == AMDGPU_HW_IP_VCN_DEC ||
>> + hw_ip == AMDGPU_HW_IP_UVD_ENC ||
>> + hw_ip == AMDGPU_HW_IP_UVD;
>> +
>> + if (static_load_balancing) {
>> sched = drm_sched_pick_best(scheds, num_scheds);
>> scheds = &sched;
>> num_scheds = 1;
>> + aring = container_of(sched, struct amdgpu_ring, sched);
>> + entity->sched_ring_score = aring->sched_score;
>> + atomic_inc(entity->sched_ring_score);
>
> If we were to bike-shed, we could find a way to avoid the new local variables.
> Keep the if as is and assign to entity->sched_ring_score directly, then check
> for that on the cleanup path. This still works thanks to kzalloc. Or, if relying on
> kzalloc is not desired, the bool static_load_balancing could at least be replaced by
> renaming the aring local to static_aring and using it like the name suggests.
>
> Could also move the atomic_inc to the success path to avoid having to add code
> to the error unwind.
>
Christian, Alex: any preferences on the code style to use here?
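For reference, the first variant would look roughly like this (untested sketch,
relying on entity being kzalloc'ed so that sched_ring_score stays NULL for the
engines that load balance freely):

	if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
	    hw_ip == AMDGPU_HW_IP_VCN_DEC ||
	    hw_ip == AMDGPU_HW_IP_UVD_ENC ||
	    hw_ip == AMDGPU_HW_IP_UVD) {
		sched = drm_sched_pick_best(scheds, num_scheds);
		scheds = &sched;
		num_scheds = 1;
		/* remember the shared score of the ring we picked */
		entity->sched_ring_score =
			container_of(sched, struct amdgpu_ring, sched)->sched_score;
		atomic_inc(entity->sched_ring_score);
	}
	...
error_free_entity:
	/* non-NULL only when the static path above was taken */
	if (entity->sched_ring_score)
		atomic_dec(entity->sched_ring_score);
	kfree(entity);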
> Both cases are, I think, equally racy in the sense that parallel
> amdgpu_ctx_init_entity invocations can all pick the same sched. But that is true
> today AFAICT, because the score is not incremented until later in the job submit
> process.
>
> I suppose one way to make the assignment more robust would be to "rotate" (or
> randomize) the sched list atomically before calling drm_sched_pick_best. Thoughts?
>
IMO, without the ability to do job-level load balancing, there will always be
cases where the scheduler choice is not optimal. For this reason I preferred to
keep the code as simple as possible.
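For completeness, my understanding of the rotation idea is something like the
sketch below (amdgpu_ctx_pick_best_rotated() is a hypothetical helper, not part
of this patch):

static struct drm_gpu_scheduler *
amdgpu_ctx_pick_best_rotated(struct drm_gpu_scheduler **scheds,
			     unsigned int num_scheds)
{
	struct drm_gpu_scheduler *rotated[8]; /* illustrative bound only */
	static atomic_t rotation;
	unsigned int start, i;

	if (num_scheds <= 1 || num_scheds > ARRAY_SIZE(rotated))
		return drm_sched_pick_best(scheds, num_scheds);

	/* Start the scan at a different index on each call so that
	 * concurrent first submits seeing identical scores are less
	 * likely to settle on the same ring. */
	start = (unsigned int)atomic_inc_return(&rotation) % num_scheds;
	for (i = 0; i < num_scheds; i++)
		rotated[i] = scheds[(start + i) % num_scheds];

	return drm_sched_pick_best(rotated, num_scheds);
}

Even with that, two entities racing between the pick and their first job
submission can still land on the same ring, so I'm not convinced the extra
complexity buys us much here.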
Thanks,
Pierre-Eric
> Regards,
>
> Tvrtko
>
>> }
>> r = drm_sched_entity_init(&entity->entity, drm_prio, scheds, num_scheds,
>> @@ -264,6 +271,9 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> drm_sched_entity_fini(&entity->entity);
>> error_free_entity:
>> + if (static_load_balancing)
>> + atomic_dec(entity->sched_ring_score);
>> +
>> kfree(entity);
>> return r;
>> @@ -514,6 +524,9 @@ static void amdgpu_ctx_do_release(struct kref *ref)
>> if (!ctx->entities[i][j])
>> continue;
>> + if (ctx->entities[i][j]->sched_ring_score)
>> + atomic_dec(ctx->entities[i][j]->sched_ring_score);
>> +
>> drm_sched_entity_destroy(&ctx->entities[i][j]->entity);
>> }
>> }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> index 090dfe86f75b..076a0e165ce0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> @@ -39,6 +39,7 @@ struct amdgpu_ctx_entity {
>> uint32_t hw_ip;
>> uint64_t sequence;
>> struct drm_sched_entity entity;
>> + atomic_t *sched_ring_score;
>> struct dma_fence *fences[];
>> };