Message-ID: <9950dd13-d5c1-4b34-b3f9-2528a1ffb989@igalia.com>
Date: Fri, 7 Nov 2025 10:39:26 +0000
From: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
To: Christian König <christian.koenig@....com>,
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>,
Alex Deucher <alexander.deucher@....com>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>
Cc: amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/3] drm/amdgpu: increment sched score on entity
selection
On 07/11/2025 10:26, Christian König wrote:
> On 11/7/25 10:04, Pierre-Eric Pelloux-Prayer wrote:
>> For hw engines that can't load balance jobs, entities are
>> "statically" load balanced: on their first submit, they select
>> the best scheduler based on its score.
>> The score is made up of 2 parts:
>> * the job queue depth (how many jobs are executing/waiting)
>> * the number of entities assigned
>>
>> The second part is only relevant for static load balancing:
>> it accounts for how many entities are attached to this
>> scheduler, knowing that if they ever submit jobs, those jobs
>> will go to this one.
>>
>> For rings that can load balance jobs freely, idle entities
>> aren't a concern and shouldn't impact the scheduler's decisions.
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 22 +++++++++++++++++-----
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 +
>> 2 files changed, 18 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index afedea02188d..4d91cbcbcf25 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -209,6 +209,7 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> struct amdgpu_ctx_entity *entity;
>> enum drm_sched_priority drm_prio;
>> unsigned int hw_prio, num_scheds;
>> + struct amdgpu_ring *aring;
>> int32_t ctx_prio;
>> int r;
>>
>> @@ -239,11 +240,13 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> goto error_free_entity;
>> }
>>
>> - /* disable load balance if the hw engine retains context among dependent jobs */
>> - if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
>> - hw_ip == AMDGPU_HW_IP_VCN_DEC ||
>> - hw_ip == AMDGPU_HW_IP_UVD_ENC ||
>> - hw_ip == AMDGPU_HW_IP_UVD) {
>> + sched = scheds[0];
>> + aring = container_of(sched, struct amdgpu_ring, sched);
>> +
>> + if (aring->funcs->engine_retains_context) {
>> + /* Disable load balancing between multiple schedulers if the hw
>> + * engine retains context among dependent jobs.
>> + */
>> sched = drm_sched_pick_best(scheds, num_scheds);
>> scheds = &sched;
>> num_scheds = 1;
>> @@ -258,6 +261,12 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>> if (cmpxchg(&ctx->entities[hw_ip][ring], NULL, entity))
>> goto cleanup_entity;
>>
>> + if (aring->funcs->engine_retains_context) {
>> + aring = container_of(sched, struct amdgpu_ring, sched);
>> + entity->sched_score = aring->sched_score;
>> + atomic_inc(entity->sched_score);
>> + }
>> +
>> return 0;
>>
>> cleanup_entity:
>> @@ -514,6 +523,9 @@ static void amdgpu_ctx_do_release(struct kref *ref)
>> if (!ctx->entities[i][j])
>> continue;
>>
>> + if (ctx->entities[i][j]->sched_score)
>> + atomic_dec(ctx->entities[i][j]->sched_score);
>> +
>> drm_sched_entity_destroy(&ctx->entities[i][j]->entity);
>> }
>> }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> index 090dfe86f75b..f7b44f96f374 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
>> @@ -39,6 +39,7 @@ struct amdgpu_ctx_entity {
>> uint32_t hw_ip;
>> uint64_t sequence;
>> struct drm_sched_entity entity;
>> + atomic_t *sched_score;
>
> I would rather not have that additional member here.
>
> In addition to that, we are messing with the internals of the scheduler here and should probably have two clean functions to increase/decrease the score.
Don't forget it is a driver-owned atomic_t, so I think it is fine if the 
driver manipulates it on its own.
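
That said, if we do want wrappers they could just as well live on the 
amdgpu side. Something along these lines (purely a sketch with 
hypothetical names, not part of this patch), keeping the atomic 
handling in one place in amdgpu_ctx.c without touching scheduler 
internals:

	static void amdgpu_ctx_entity_get_score(struct amdgpu_ctx_entity *entity,
						struct amdgpu_ring *ring)
	{
		/* Remember the driver-owned score so release can drop it. */
		entity->sched_score = ring->sched_score;
		atomic_inc(entity->sched_score);
	}

	static void amdgpu_ctx_entity_put_score(struct amdgpu_ctx_entity *entity)
	{
		if (entity->sched_score)
			atomic_dec(entity->sched_score);
	}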
Regards,
Tvrtko
> Regards,
> Christian.
>
>> struct dma_fence *fences[];
>> };
>>
>