Message-ID: <95b655c2-dd60-488e-ab07-c7b958da1457@arm.com>
Date: Tue, 3 Dec 2024 11:41:47 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>, mingo@...hat.com,
 peterz@...radead.org, juri.lelli@...hat.com, rostedt@...dmis.org,
 bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
 linux-kernel@...r.kernel.org
Cc: kprateek.nayak@....com, pauld@...hat.com, efault@....de,
 luis.machado@....com, tj@...nel.org, void@...ifault.com
Subject: Re: [PATCH 0/11 v3] sched/fair: Fix statistics with delayed dequeue

On 02/12/2024 18:45, Vincent Guittot wrote:
> The delayed dequeue feature keeps a sleeping sched_entity enqueued until
> its lag has elapsed. As a result, it also stays visible in the statistics
> that are used to balance the system, in particular the field h_nr_running.
> 
> This series fixes those metrics by creating a new h_nr_runnable that tracks
> only tasks that want to run. It renames h_nr_running into h_nr_queued.
> 
> h_nr_runnable is used in several places to make load-balancing decisions:
>   - PELT runnable_avg
>   - deciding if a group is overloaded or has spare capacity
>   - numa stats
>   - reduced capacity management
>   - load balance between groups
> 
> While fixing h_nr_running, some fields have been renamed to follow the
> same pattern. We now have:
>   - cfs.h_nr_runnable : running tasks in the hierarchy
>   - cfs.h_nr_queued : enqueued tasks in the hierarchy either running or
>       delayed dequeue
>   - cfs.h_nr_idle : enqueued sched idle tasks in the hierarchy
> 
> cfs.nr_running has been renamed to cfs.nr_queued because it includes the
> delayed dequeued entities
> 
> The unused cfs.idle_nr_running has been removed
> 
> Load balance compares the number of running tasks when selecting the
> busiest group or runqueue and tries to migrate a runnable task, not a
> sleeping delayed dequeue one. Delayed dequeue tasks are considered only
> when migrating load, as they continue to impact it.
> 
> It should be noted that this series doesn't fix the problem of delayed
> dequeued tasks that can't migrate at wakeup.
> 
> Some additional cleanups have been added:
>   - move variable declarations to the beginning of pick_next_entity()
>     and dequeue_entity()
>   - sched_can_stop_tick() should use cfs.h_nr_queued instead of
>     cfs.nr_queued (previously cfs.nr_running) to know how many tasks
>     are running in the whole hierarchy instead of how many entities at
>     root level
> 
> Changes since v2:
> - Fix h_nr_runnable after removing h_nr_delayed (reported by Mike and Prateek)
> - Move "sched/fair: Fix sched_can_stop_tick() for fair tasks" to the
>   beginning of the series so it can be easily backported (asked by Prateek)
> - Split "sched/fair: Add new cfs_rq.h_nr_runnable" into 2 patches. One
>   for the creation of h_nr_runnable and one for its use (asked by Peter)
> - Fix more variable declarations (reported by Prateek)

with the following nits:

(1) 01/11

    Proposed 'Fixes:' missing:
    https://lkml.kernel.org/r/c82ed217-cfe4-41a4-b39a-e7356231835f@amd.com

(2) 08/11

    Would be helpful to point out that we lost the only use case for 
    'cfs_rq->idle_nr_running' with the advent of EEVDF with:

    5e963f2bd465 - sched/fair: Commit to EEVDF

(3) Using nr_running on rq/rt_rq/dl_rq and nr_queued 
    for cfs_rq might look strange to the untrained eye.
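
    To make the naming in (3) less surprising, here is a rough toy model
    of the two hierarchy counters as described in the cover letter. It is
    only a sketch: struct toy_cfs_rq and the toy_*() helpers are made-up
    names for illustration, not kernel code, and deriving the number of
    delayed tasks as queued minus runnable is an assumption based on the
    field descriptions above.

```c
#include <assert.h>

/* Toy stand-in for the cfs_rq counters; NOT the kernel's struct cfs_rq. */
struct toy_cfs_rq {
	unsigned int h_nr_queued;   /* enqueued tasks: runnable + delayed dequeue */
	unsigned int h_nr_runnable; /* enqueued tasks that want to run */
};

/* A task wakes up and is enqueued: it counts in both fields. */
static void toy_enqueue(struct toy_cfs_rq *cfs)
{
	cfs->h_nr_queued++;
	cfs->h_nr_runnable++;
}

/* A task goes to sleep but its dequeue is delayed: it stays queued only. */
static void toy_delay_dequeue(struct toy_cfs_rq *cfs)
{
	assert(cfs->h_nr_runnable > 0);
	cfs->h_nr_runnable--;
}

/* The delayed task finally leaves the runqueue. */
static void toy_finish_delayed_dequeue(struct toy_cfs_rq *cfs)
{
	/* There must be at least one queued task that is not runnable. */
	assert(cfs->h_nr_queued > cfs->h_nr_runnable);
	cfs->h_nr_queued--;
}

/* Sleeping tasks that are still enqueued (assumed: queued - runnable). */
static unsigned int toy_nr_delayed(const struct toy_cfs_rq *cfs)
{
	return cfs->h_nr_queued - cfs->h_nr_runnable;
}
```

    In this model only cfs_rq can have entities that are queued without
    wanting to run, which is why "queued" fits there while rq/rt_rq/dl_rq
    keep "running".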

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@....com>

[...]
