Message-ID: <20240329020614.ntohx3unopuo5ttl@airbuntu>
Date: Fri, 29 Mar 2024 02:06:14 +0000
From: Qais Yousef <qyousef@...alina.io>
To: Ingo Molnar <mingo@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>
Cc: linux-kernel@...r.kernel.org, Pierre Gondois <Pierre.Gondois@....com>,
	Lukasz Luba <lukasz.luba@....com>
Subject: Re: [PATCH v8 0/4] sched: Don't trigger misfit if affinity is
 restricted

+Lukasz

On 03/24/24 00:45, Qais Yousef wrote:
> There was a discussion on handling a hotplug operation that removes a capacity
> level and causes an unnecessary misfit lb to trigger again. I opted not to
> handle it now, but a working patch is available in [1]. I don't feel strongly
> about it and would leave it up to the maintainers to decide which direction
> they prefer. Patch 4 will make sure that balance_interval and
> nr_balance_failed won't grow because of an unnecessary misfit lb. It will lead
> to some sub-optimality, but no incorrect behavior.
> 
> After the 6.9 merge window, the dynamic Energy Model series will be merged and
> it can lead to the capacities of the CPUs being changed at runtime. This means
> I need to post a follow-up patch to handle this situation and ensure
> max_allowed_capacity is correct after an EM update. It might then make handling
> of hotplug operations attractive too, as there would be some common ground.

I was trying to work on this follow-up patch now that tip has moved to 6.9-rc1,
but I can't see how the new dynamic EM logic will trigger an update to
asym_cap_list. Did I miss something? Will/should init_cpu_capacity_callback()
be triggered after the update?

How will the scheduler know that the new max capacities are different? Or did
I misunderstand the new EM runtime logic, and it won't lead to new
arch_scale_cpu_capacity() values?
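
To illustrate the concern, here is a minimal userspace sketch, not kernel code.
It assumes a capacity list sorted in descending order and a value cached per
task when its affinity changes, as described in the v4 and v2 changelogs below.
All type and function names (cap_level, task, find_max_allowed_capacity(),
task_set_affinity(), cache_is_stale()) are hypothetical stand-ins, and CPU
masks are simplified to a single unsigned long.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one capacity level; not the kernel's type. */
struct cap_level {
	unsigned long capacity;	/* capacity shared by the CPUs in this level */
	unsigned long cpus;	/* bit n set => CPU n belongs to this level */
};

/* Hypothetical stand-in for the per-task state. */
struct task {
	unsigned long allowed;			/* affinity mask, bit n => CPU n */
	unsigned long max_allowed_capacity;	/* cached on affinity change */
};

/*
 * Return the largest capacity the affinity mask can reach. The levels
 * must be sorted in descending capacity order, so the first level that
 * intersects the mask is the answer. Returns 0 if nothing intersects.
 */
unsigned long find_max_allowed_capacity(const struct cap_level *levels,
					size_t nr, unsigned long allowed)
{
	assert(levels || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		if (levels[i].cpus & allowed)
			return levels[i].capacity;
	}
	return 0;
}

/* Slow path: refresh the cached value when the affinity changes. */
void task_set_affinity(struct task *p, const struct cap_level *levels,
		       size_t nr, unsigned long allowed)
{
	p->allowed = allowed;
	p->max_allowed_capacity = find_max_allowed_capacity(levels, nr, allowed);
}

/* Non-zero if the cached value no longer matches the current list. */
int cache_is_stale(const struct task *p, const struct cap_level *levels,
		   size_t nr)
{
	return p->max_allowed_capacity !=
	       find_max_allowed_capacity(levels, nr, p->allowed);
}
```

In this sketch, if a level's capacity changes after task_set_affinity() has
run, the cached value stays at the old number until something recomputes it.
That is the situation the questions above are about: whether anything rebuilds
the list, and hence the cached values, after a runtime EM update.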


Thanks!

--
Qais Yousef

> 
> [1] https://lore.kernel.org/lkml/20240321122039.7gk2mc3syvkrvhjz@airbuntu/
> 
> Changes since v7:
> 
> 	* Remove sd arg from check_misfit_status()
> 	* Update typo in commit message in patch 2.
> 	* Add Reviewed-by from Vincent
> 
> Changes since v6:
> 
> 	* Simplify update_misfit_status
> 
> Changes since v5:
> 
> 	* Remove redundant check to rq->rd->max_cpu_capacity
> 	* Simplify check_misfit_status() further by removing unnecessary checks.
> 	* Add new patch to remove no longer used rd->max_cpu_capacity
> 	* Add new patch to prevent misfit lb from polluting balance_interval
> 	  and nr_balance_failed
> 
> Changes since v4:
> 
> 	* Store max_allowed_capacity in task_struct and populate it when
> 	  affinity changes to avoid iterating through the capacities list in the
> 	  fast path (Vincent)
> 	* Use rq->rd->max_cpu_capacity which is updated after hotplug
> 	  operations to check biggest allowed capacity in the system.
> 	* Undo the change to check_misfit_status() and improve the function to
> 	  avoid similar confusion in the future.
> 	* Split the patches differently. Exporting and sorting the capacity list
> 	  is now patch 1, handling of affinity for misfit detection is patch 2.
> 
> Changes since v3:
> 
> 	* Update commit message of patch 2 to be less verbose
> 
> Changes since v2:
> 
> 	* Convert access of asym_cap_list to be rcu protected
> 	* Add new patch to sort the list in descending order
> 	* Move some declarations inside affinity check block
> 	* Remove now redundant check against max_cpu_capacity in check_misfit_status()
> 
> Changes since v1:
> 
> 	* Use asym_cap_list (thanks Dietmar) to iterate instead of iterating
> 	  through every cpu which Vincent was concerned about.
> 	* Use uclamped util to compare with capacity instead of util_fits_cpu()
> 	  when iterating through capacities (Dietmar).
> 	* Update commit log with test results to better demonstrate the problem
> 
> v1 discussion: https://lore.kernel.org/lkml/20230820203429.568884-1-qyousef@layalina.io/
> v2 discussion: https://lore.kernel.org/lkml/20231212154056.626978-1-qyousef@layalina.io/
> v3 discussion: https://lore.kernel.org/lkml/20231231175218.510721-1-qyousef@layalina.io/
> v4 discussion: https://lore.kernel.org/lkml/20240105222014.1025040-1-qyousef@layalina.io/
> v5 discussion: https://lore.kernel.org/lkml/20240205021123.2225933-1-qyousef@layalina.io/
> v6, v7 discussion: https://lore.kernel.org/lkml/20240220225622.2626569-1-qyousef@layalina.io/
> 
> Thanks!
> 
> --
> Qais Yousef
> 
> Qais Yousef (4):
>   sched/topology: Export asym_capacity_list
>   sched/fair: Check a task has a fitting cpu when updating misfit
>   sched/topology: Remove max_cpu_capacity from root_domain
>   sched/fair: Don't double balance_interval for migrate_misfit
> 
>  include/linux/sched.h   |  1 +
>  init/init_task.c        |  1 +
>  kernel/sched/fair.c     | 79 +++++++++++++++++++++++++++++++----------
>  kernel/sched/sched.h    | 16 +++++++--
>  kernel/sched/topology.c | 56 ++++++++++++++---------------
>  5 files changed, 104 insertions(+), 49 deletions(-)
> 
> -- 
> 2.34.1
> 
