Date: Fri, 26 Apr 2024 11:16:23 +0100
From: Luis Machado <luis.machado@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
 dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
 mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
 linux-kernel@...r.kernel.org, kprateek.nayak@....com,
 wuyun.abel@...edance.com, tglx@...utronix.de, efault@....de, nd
 <nd@....com>, John Stultz <jstultz@...gle.com>, Hongyan.Xia2@....com
Subject: Re: [RFC][PATCH 08/10] sched/fair: Implement delayed dequeue

On 4/26/24 10:32, Peter Zijlstra wrote:
> On Thu, Apr 25, 2024 at 01:49:49PM +0200, Peter Zijlstra wrote:
>> On Thu, Apr 25, 2024 at 12:42:20PM +0200, Peter Zijlstra wrote:
>>
>>>> I wonder if the delayed dequeue logic is having an unwanted effect on the calculation of
>>>> utilization/load of the runqueue and, as a consequence, we're scheduling things to run on
>>>> higher OPPs in the big cores, leading to poor decisions for energy efficiency.
>>>
>>> Notably util_est_update() gets delayed. Given we don't actually do an
>>> enqueue when a delayed task gets woken, it didn't seem to make sense to
>>> update that sooner.
>>
>> The PELT runnable values will be inflated because of delayed dequeue.
>> cpu_util() uses those in the @boost case, and as such this can indeed
>> affect things.
>>
>> This can also slightly affect the cgroup case, but since the delay goes
>> away as contention goes away, and the cgroup case must already assume
>> worst case overlap, this seems limited.
>>
>> /me goes ponder things moar.
> 
> First order approximation of a fix would be something like the totally
> untested below I suppose...


Thanks Peter. Let me give it a try and I'll report back.

> 
> ---
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index cfd1fd188d29..f3f70b5adca0 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5391,6 +5391,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>  			if (cfs_rq->next == se)
>  				cfs_rq->next = NULL;
>  			se->sched_delayed = 1;
> +			update_load_avg(cfs_rq, se, 0);
>  			return false;
>  		}
>  	}
> @@ -6817,6 +6818,7 @@ requeue_delayed_entity(struct sched_entity *se)
>  	}
>  
>  	se->sched_delayed = 0;
> +	update_load_avg(qcfs_rq, se, 0);
>  }
>  
>  /*
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index d07a3b98f1fb..d16529613123 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -810,6 +810,9 @@ static inline void se_update_runnable(struct sched_entity *se)
>  
>  static inline long se_runnable(struct sched_entity *se)
>  {
> +	if (se->sched_delayed)
> +		return false;
> +
>  	if (entity_is_task(se))
>  		return !!se->on_rq;
>  	else
> @@ -823,6 +826,9 @@ static inline void se_update_runnable(struct sched_entity *se) {}
>  
>  static inline long se_runnable(struct sched_entity *se)
>  {
> +	if (se->sched_delayed)
> +		return false;
> +
>  	return !!se->on_rq;
>  }
>  #endif
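
For anyone following along, here is a rough user-space sketch of what the
se_runnable() hunks change for a task entity. It is only an illustration:
struct toy_se and the toy_* helpers are made-up stand-ins (not kernel code)
that model just the on_rq and sched_delayed fields, and the group-entity
branch is left out.

```c
/* Hypothetical stand-in for the two sched_entity fields the patch looks at. */
struct toy_se {
	int on_rq;         /* entity is still queued on the runqueue */
	int sched_delayed; /* dequeue was deferred (delayed dequeue) */
};

/* Without the patch: anything still on the runqueue counts as runnable. */
static long toy_se_runnable_old(const struct toy_se *se)
{
	return !!se->on_rq;
}

/* With the patch: a delayed entity stops contributing to runnable. */
static long toy_se_runnable_new(const struct toy_se *se)
{
	if (se->sched_delayed)
		return 0;

	return !!se->on_rq;
}
```

With on_rq = 1 and sched_delayed = 1, the old helper reports the entity as
runnable and the new one does not, which is the inflation of the runnable
signal discussed above. Entities that are not delayed are reported the same
way by both.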

