Message-Id: <20240324004552.999936-1-qyousef@layalina.io>
Date: Sun, 24 Mar 2024 00:45:48 +0000
From: Qais Yousef <qyousef@...alina.io>
To: Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>
Cc: linux-kernel@...r.kernel.org,
	"Pierre Gondois" <Pierre.Gondois@....com>,
	Qais Yousef <qyousef@...alina.io>
Subject: [PATCH v8 0/4] sched: Don't trigger misfit if affinity is restricted

There was a discussion about handling a hotplug operation that removes a
capacity level, which would cause unnecessary misfit load balance (lb) to
trigger again. I opted not to handle it now, but a working patch is available
in [1]. I don't feel strongly about it and will leave it up to the maintainers
to decide which direction they prefer. Patch 4 makes sure that balance_interval
and nr_balance_failed won't grow because of such unnecessary misfit lb. Leaving
hotplug unhandled will lead to some sub-optimality, but no incorrect behavior.
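
As a rough illustration of what patch 4 aims for, the sketch below skips the
failure accounting when the failed balance attempt was a misfit one. It is a
standalone userspace model, not the code from the patch: struct sd_model, the
MODEL_MIGRATE_* values and account_failed_balance() are hypothetical names, and
the "double up to a cap" rule is a simplifying assumption.

```c
/* Hypothetical model of the reason a load balance attempt was started. */
enum model_migration_type {
	MODEL_MIGRATE_LOAD,
	MODEL_MIGRATE_MISFIT,
};

/* Hypothetical subset of the per-domain balancing state. */
struct sd_model {
	unsigned int balance_interval;
	unsigned int max_interval;
	unsigned int nr_balance_failed;
};

/*
 * Account for a failed balance attempt. A failed misfit attempt is
 * ignored so that it does not inflate the interval or failure counter.
 */
void account_failed_balance(struct sd_model *sd,
			    enum model_migration_type type)
{
	unsigned int doubled;

	if (type == MODEL_MIGRATE_MISFIT)
		return;

	sd->nr_balance_failed++;

	/* Assumed back-off rule: double the interval, clamped to the max. */
	doubled = sd->balance_interval * 2;
	sd->balance_interval = doubled > sd->max_interval ?
			       sd->max_interval : doubled;
}
```

With this model, a run of failed misfit attempts leaves both fields untouched,
while any other failed attempt still backs off as before.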

After the 6.9 merge window, the dynamic Energy Model series should be merged,
and it can lead to the capacities of the CPUs changing at runtime. This means I
need to post a follow-up patch to ensure max_allowed_capacity is correct after
an EM update. That might make handling of hotplug operations attractive too, as
there would be some common ground.

[1] https://lore.kernel.org/lkml/20240321122039.7gk2mc3syvkrvhjz@airbuntu/

Changes since v7:

	* Remove sd arg from check_misfit_status()
	* Fix typo in the commit message of patch 2.
	* Add Reviewed-by from Vincent

Changes since v6:

	* Simplify update_misfit_status

Changes since v5:

	* Remove redundant check to rq->rd->max_cpu_capacity
	* Simplify check_misfit_status() further by removing unnecessary checks.
	* Add new patch to remove no longer used rd->max_cpu_capacity
	* Add new patch to prevent misfit lb from polluting balance_interval
	  and nr_balance_failed

Changes since v4:

	* Store max_allowed_capacity in task_struct and populate it when
	  affinity changes to avoid iterating through the capacities list in the
	  fast path (Vincent)
	* Use rq->rd->max_cpu_capacity which is updated after hotplug
	  operations to check biggest allowed capacity in the system.
	* Undo the change to check_misfit_status() and improve the function to
	  avoid similar confusion in the future.
	* Split the patches differently. Export the capacity list and sort it
	  is now patch 1, handling of affinity for misfit detection is patch 2.
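
The idea in the v4 changes above can be sketched as follows: compute the
largest capacity a task's affinity allows once, when the affinity changes, then
use that cached value in the misfit check. This is a minimal userspace model
under stated assumptions, not the kernel code: struct cap_level, the 64-bit
CPU masks and both function names are hypothetical, and a plain "util <=
capacity" comparison stands in for the real fitness test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the capacity list. */
struct cap_level {
	unsigned long capacity;	/* capacity shared by the CPUs below */
	uint64_t cpus;		/* bitmask of CPUs at this capacity */
};

/*
 * Slow path, run when affinity changes. The levels are assumed to be
 * sorted in descending capacity order, so the first level that
 * intersects the affinity mask is the biggest one the task may use.
 */
unsigned long max_allowed_capacity(const struct cap_level *levels,
				   size_t nr_levels, uint64_t affinity)
{
	for (size_t i = 0; i < nr_levels; i++) {
		if (levels[i].cpus & affinity)
			return levels[i].capacity;
	}
	return 0;	/* empty affinity mask: no CPU allowed */
}

/*
 * Fast path. A task is misfit only if it does not fit its current CPU
 * and its affinity allows a bigger CPU to move to.
 */
bool task_is_misfit(unsigned long util, unsigned long cpu_capacity,
		    unsigned long max_allowed)
{
	if (util <= cpu_capacity)
		return false;
	if (cpu_capacity >= max_allowed)
		return false;
	return true;
}
```

A task pinned to the smallest CPUs gets max_allowed equal to their capacity, so
it is never flagged as misfit however busy it is.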

Changes since v3:

	* Update commit message of patch 2 to be less verbose

Changes since v2:

	* Convert access of asym_cap_list to be rcu protected
	* Add new patch to sort the list in descending order
	* Move some declarations inside affinity check block
	* Remove now redundant check against max_cpu_capacity in check_misfit_status()

Changes since v1:

	* Use asym_cap_list (thanks Dietmar) to iterate instead of iterating
	  through every cpu which Vincent was concerned about.
	* Use uclamped util to compare with capacity instead of util_fits_cpu()
	  when iterating through capacities (Dietmar).
	* Update commit log with test results to better demonstrate the problem

v1 discussion: https://lore.kernel.org/lkml/20230820203429.568884-1-qyousef@layalina.io/
v2 discussion: https://lore.kernel.org/lkml/20231212154056.626978-1-qyousef@layalina.io/
v3 discussion: https://lore.kernel.org/lkml/20231231175218.510721-1-qyousef@layalina.io/
v4 discussion: https://lore.kernel.org/lkml/20240105222014.1025040-1-qyousef@layalina.io/
v5 discussion: https://lore.kernel.org/lkml/20240205021123.2225933-1-qyousef@layalina.io/
v6, v7 discussion: https://lore.kernel.org/lkml/20240220225622.2626569-1-qyousef@layalina.io/

Thanks!

--
Qais Yousef

Qais Yousef (4):
  sched/topology: Export asym_capacity_list
  sched/fair: Check a task has a fitting cpu when updating misfit
  sched/topology: Remove max_cpu_capacity from root_domain
  sched/fair: Don't double balance_interval for migrate_misfit

 include/linux/sched.h   |  1 +
 init/init_task.c        |  1 +
 kernel/sched/fair.c     | 79 +++++++++++++++++++++++++++++++----------
 kernel/sched/sched.h    | 16 +++++++--
 kernel/sched/topology.c | 56 ++++++++++++++---------------
 5 files changed, 104 insertions(+), 49 deletions(-)

-- 
2.34.1

