lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241128092750.2541735-1-vincent.guittot@linaro.org>
Date: Thu, 28 Nov 2024 10:27:41 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: mingo@...hat.com,
	peterz@...radead.org,
	juri.lelli@...hat.com,
	dietmar.eggemann@....com,
	rostedt@...dmis.org,
	bsegall@...gle.com,
	mgorman@...e.de,
	vschneid@...hat.com,
	linux-kernel@...r.kernel.org
Cc: kprateek.nayak@....com,
	pauld@...hat.com,
	efault@....de,
	luis.machado@....com,
	Vincent Guittot <vincent.guittot@...aro.org>
Subject: [PATCH 0/9] sched/fair: Fix statistics with delayed dequeue

Delayed dequeued feature keeps a sleeping sched_entitiy enqueued until its
lag has elapsed. As a result, it stays also visible in the statistics that
are used to balance the system and in particular the field h_nr_running.

This serie fixes those metrics by creating a new h_nr_enqueued that tracks
all enqueued tasks and restore the behavior of h_nr_running i.e. tracking
the number of fair tasks that want to run.

h_nr_running is used in several places to make decision on load balance:
  - PELT runnable_avg
  - deciding if a group is overloaded or has spare capacity
  - numa stats
  - reduced capacity management
  - load balance between groups

While fixing h_nr_running, some fields have been renamed to follow the
same pattern. We now have:
  - cfs.h_nr_running : running tasks in the hierarchy
  - cfs.h_nr_enqueued : enqueued tasks in the hierarchy either running or
      delayed dequeue
  - cfs.h_nr_idle : enqueued sched idle tasks in the hierarchy

cfs.nr_running has been rename cfs.nr_enqueued because it includes the
delayed dequeued entities

The unused cfs.idle_nr_running has been removed

Load balance compares the number of running tasks when selecting the
busiest group or runqueue and tries to migrate a runnable task and not a
sleeping delayed dequeue one.

It should be noticed that this serie doesn't fix the problem of delayed
dequeued tasks that can't migrate at wakeup.

Some additional cleanups have been added:
  - move variable declaration at the beginning of pick_next_entity() 
  - sched_can_stop_tick() should use cfs.h_nr_enqueued instead of
    cfs.nr_enqueued (previously cfs.nr_running) to know how many tasks
    are running in the whole hierarchy instead of how many entities at
    root level


Peter Zijlstra (1):
  sched/eevdf: More PELT vs DELAYED_DEQUEUE

Vincent Guittot (8):
  sched/fair: Add new cfs_rq.h_nr_enqueued
  sched/fair: Rename cfs_rq.idle_h_nr_running into h_nr_idle
  sched/fair: Remove unused cfs_rq.idle_nr_running
  sched/fair: Rename cfs_rq.nr_running into nr_enqueued
  sched/fair: Removed unsued cfs_rq.h_nr_delayed
  sched/fair: Do not try to migrate delayed dequeue task
  sched/fair: Fix sched_can_stop_tick() for fair tasks
  sched/fair: Fix variable declaration position

 kernel/sched/core.c  |   4 +-
 kernel/sched/debug.c |  13 ++-
 kernel/sched/fair.c  | 214 +++++++++++++++++++++++++------------------
 kernel/sched/sched.h |   8 +-
 4 files changed, 138 insertions(+), 101 deletions(-)

-- 
2.43.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ