lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241202174606.4074512-1-vincent.guittot@linaro.org>
Date: Mon,  2 Dec 2024 18:45:55 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: mingo@...hat.com,
	peterz@...radead.org,
	juri.lelli@...hat.com,
	dietmar.eggemann@....com,
	rostedt@...dmis.org,
	bsegall@...gle.com,
	mgorman@...e.de,
	vschneid@...hat.com,
	linux-kernel@...r.kernel.org
Cc: kprateek.nayak@....com,
	pauld@...hat.com,
	efault@....de,
	luis.machado@....com,
	tj@...nel.org,
	void@...ifault.com,
	Vincent Guittot <vincent.guittot@...aro.org>
Subject: [PATCH 0/11 v3] sched/fair: Fix statistics with delayed dequeue

Delayed dequeued feature keeps a sleeping sched_entitiy enqueued until its
lag has elapsed. As a result, it stays also visible in the statistics that
are used to balance the system and in particular the field h_nr_running.

This serie fixes those metrics by creating a new h_nr_runnable that tracks
only tasks that want to run. It renames h_nr_running into h_nr_runnable.

h_nr_runnable is used in several places to make decision on load balance:
  - PELT runnable_avg
  - deciding if a group is overloaded or has spare capacity
  - numa stats
  - reduced capacity management
  - load balance between groups

While fixing h_nr_running, some fields have been renamed to follow the
same pattern. We now have:
  - cfs.h_nr_runnable : running tasks in the hierarchy
  - cfs.h_nr_queued : enqueued tasks in the hierarchy either running or
      delayed dequeue
  - cfs.h_nr_idle : enqueued sched idle tasks in the hierarchy

cfs.nr_running has been rename cfs.nr_queued because it includes the
delayed dequeued entities

The unused cfs.idle_nr_running has been removed

Load balance compares the number of running tasks when selecting the
busiest group or runqueue and tries to migrate a runnable task and not a
sleeping delayed dequeue one. delayed dequeue tasks are considered only
when migrating load as they continue to impact it.

It should be noticed that this serie doesn't fix the problem of delayed
dequeued tasks that can't migrate at wakeup.

Some additional cleanups have been added:
  - move variable declaration at the beginning of pick_next_entity()
    and dequeue_entity() 
  - sched_can_stop_tick() should use cfs.h_nr_queued instead of
    cfs.nr_queued (previously cfs.nr_running) to know how many tasks
    are running in the whole hierarchy instead of how many entities at
    root level

Changes since v2:
- Fix h_nr_runnable after removing h_nr_delayed (reported by Mike and Prateek)
- Move "sched/fair: Fix sched_can_stop_tick() for fair tasks" at the
  beginning of the series so it can be easily backported (asked by Prateek)
- Split "sched/fair: Add new cfs_rq.h_nr_runnable" in 2 patches. One
  for the creation of h_nr_runnable and one for its use (asked by Peter)
- Fix more variable declarations (reported Prateek)


Changes since v1:
- reorder the patches
- rename fields into:
  - h_nr_queued for all tasks queued both runnable and delayed dequeue
  - h_nr_runnable for all runnable tasks
  - h_nr_idle for all tasks with sched_idle policy
- Cleanup how h_nr_runnable is updated in enqueue_task_fair() and
  dequeue_entities

Peter Zijlstra (1):
  sched/eevdf: More PELT vs DELAYED_DEQUEUE

Vincent Guittot (10):
  sched/fair: Fix sched_can_stop_tick() for fair tasks
  sched/fair: Rename h_nr_running into h_nr_queued
  sched/fair: Add new cfs_rq.h_nr_runnable
  sched/fair: Use the new cfs_rq.h_nr_runnable
  sched/fair: Removed unsued cfs_rq.h_nr_delayed
  sched/fair: Rename cfs_rq.idle_h_nr_running into h_nr_idle
  sched/fair: Remove unused cfs_rq.idle_nr_running
  sched/fair: Rename cfs_rq.nr_running into nr_queued
  sched/fair: Do not try to migrate delayed dequeue task
  sched/fair: Fix variable declaration position

 kernel/sched/core.c  |   4 +-
 kernel/sched/debug.c |  14 ++-
 kernel/sched/fair.c  | 240 ++++++++++++++++++++++++-------------------
 kernel/sched/pelt.c  |   4 +-
 kernel/sched/sched.h |  12 +--
 5 files changed, 153 insertions(+), 121 deletions(-)

-- 
2.43.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ