Message-ID: <1ccbc757-2437-4b40-b3ea-1e6926bc5b0d@amd.com>
Date: Tue, 24 Dec 2024 11:10:48 +0530
From: "Sapkal, Swapnil" <swapnil.sapkal@....com>
To: Shrikanth Hegde <sshegde@...ux.ibm.com>
CC: <dietmar.eggemann@....com>, <rostedt@...dmis.org>, <bsegall@...gle.com>,
	<mgorman@...e.de>, <vschneid@...hat.com>, <iamjoonsoo.kim@....com>,
	<qyousef@...alina.io>, <alexs@...nel.org>, <lukasz.luba@....com>,
	<gautham.shenoy@....com>, <kprateek.nayak@....com>, <ravi.bangoria@....com>,
	<linux-kernel@...r.kernel.org>, <linux-doc@...r.kernel.org>, Adam Li
	<adamli@...amperecomputing.com>, <peterz@...radead.org>, <mingo@...hat.com>,
	<juri.lelli@...hat.com>, <vincent.guittot@...aro.org>, <corbet@....net>
Subject: Re: [PATCH v2 6/6] docs: Update Schedstat version to 17

Hello Shrikanth,

On 12/23/2024 1:05 PM, Shrikanth Hegde wrote:
> 
> 
> On 12/20/24 12:02, Swapnil Sapkal wrote:
>> Update the Schedstat version to 17 as more fields are added to report
>> different kinds of imbalances in the sched domain. Also domain field
>> started printing corresponding domain name.
>>
>> Signed-off-by: Swapnil Sapkal <swapnil.sapkal@....com>
> 
> +Adam who had posted a patch to correct the doc for flip of idle, busy.
> 
> https://lore.kernel.org/all/20241209035428.898293-1-adamli@...amperecomputing.com/
> 

Okay, I was not aware.

>> ---
>>   Documentation/scheduler/sched-stats.rst | 126 ++++++++++++++----------
>>   kernel/sched/stats.c                    |   2 +-
>>   2 files changed, 76 insertions(+), 52 deletions(-)
>>
>> diff --git a/Documentation/scheduler/sched-stats.rst b/Documentation/scheduler/sched-stats.rst
>> index 7c2b16c4729d..caea83d91c67 100644
>> --- a/Documentation/scheduler/sched-stats.rst
>> +++ b/Documentation/scheduler/sched-stats.rst
>> @@ -2,6 +2,12 @@
>>   Scheduler Statistics
>>   ====================
>> +Version 17 of schedstats removed 'lb_imbalance' field as it has no
>> +significance anymore and instead added more relevant fields namely
>> +'lb_imbalance_load', 'lb_imbalance_util', 'lb_imbalance_task' and
>> +'lb_imbalance_misfit'. The domain field prints the name of the
>> +corresponding sched domain from this version onwards.
>> +
>>   Version 16 of schedstats changed the order of definitions within
>>   'enum cpu_idle_type', which changed the order of [CPU_MAX_IDLE_TYPES]
>>   columns in show_schedstat(). In particular the position of CPU_IDLE
>> @@ -9,7 +15,9 @@ and __CPU_NOT_IDLE changed places. The size of the array is unchanged.
>>   Version 15 of schedstats dropped counters for some sched_yield:
>>   yld_exp_empty, yld_act_empty and yld_both_empty. Otherwise, it is
>> -identical to version 14.
>> +identical to version 14. Details are available at
>> +
>> +    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/scheduler/sched-stats.txt?id=1e1dbb259c79b
>>   Version 14 of schedstats includes support for sched_domains, which hit the
>>   mainline kernel in 2.6.20 although it is identical to the stats from version
>> @@ -26,7 +34,14 @@ cpus on the machine, while domain0 is the most tightly focused domain,
>>   sometimes balancing only between pairs of cpus.  At this time, there
>>   are no architectures which need more than three domain levels. The first
>>   field in the domain stats is a bit map indicating which cpus are affected
>> -by that domain.
>> +by that domain. Details are available at
>> +
>> +    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/sched-stats.txt?id=b762f3ffb797c
>> +
>> +The schedstat documentation is maintained version 10 onwards and is not
>> +updated for version 11 and 12. The details for version 10 are available at
>> +
>> +    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/sched-stats.txt?id=1da177e4c3f4
>>   These fields are counters, and only increment.  Programs which make use
>>   of these will need to start with a baseline observation and then calculate
>> @@ -71,88 +86,97 @@ Domain statistics
>>   -----------------
>>   One of these is produced per domain for each cpu described. (Note that if
>>   CONFIG_SMP is not defined, *no* domains are utilized and these lines
>> -will not appear in the output.)
>> +will not appear in the output. <name> is an extension to the domain field
>> +that prints the name of the corresponding sched domain. It can appear in
>> +schedstat version 17 and above, and requires CONFIG_SCHED_DEBUG.)
>> -domain<N> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
>> +domain<N> <name> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
>>   The first field is a bit mask indicating what cpus this domain operates over.
>> -The next 24 are a variety of sched_balance_rq() statistics in grouped into types
>> -of idleness (idle, busy, and newly idle):
>> +The next 33 are a variety of sched_balance_rq() statistics in grouped into types
>> +of idleness (busy, idle and newly idle):
>>       1)  # of times in this domain sched_balance_rq() was called when the
>> +        cpu was busy
>> +    2)  # of times in this domain sched_balance_rq() checked but found the
>> +        load did not require balancing when busy
>> +    3)  # of times in this domain sched_balance_rq() tried to move one or
>> +        more tasks and failed, when the cpu was busy
>> +    4)  Total imbalance in load when the cpu was busy
>> +    5)  Total imbalance in utilization when the cpu was busy
>> +    6)  Total imbalance in number of tasks when the cpu was busy
>> +    7)  Total imbalance due to misfit tasks when the cpu was busy
>> +    8)  # of times in this domain pull_task() was called when busy
>> +    9)  # of times in this domain pull_task() was called even though the
>> +        target task was cache-hot when busy
> 
> pull_task has been replaced by detach_task AFAIU. So it makes sense to 
> change this to detach_task?
> 

I went through the git history and found that pull_task() was first
replaced by move_task(), which was in turn replaced by detach_task(). I
think it makes sense to replace pull_task() with detach_task() in the
docs. Please let me know if I am missing something.

>> +    10) # of times in this domain sched_balance_rq() was called but did not
>> +        find a busier queue while the cpu was busy
>> +    11) # of times in this domain a busier queue was found while the cpu
>> +        was busy but no busier group was found
>> +
>> +    12) # of times in this domain sched_balance_rq() was called when the
>>           cpu was idle
>> -    2)  # of times in this domain sched_balance_rq() checked but found
>> +    13) # of times in this domain sched_balance_rq() checked but found
>>           the load did not require balancing when the cpu was idle
>> -    3)  # of times in this domain sched_balance_rq() tried to move one or
>> +    14) # of times in this domain sched_balance_rq() tried to move one or
>>           more tasks and failed, when the cpu was idle
>> -    4)  sum of imbalances discovered (if any) with each call to
>> -        sched_balance_rq() in this domain when the cpu was idle
>> -    5)  # of times in this domain pull_task() was called when the cpu
>> +    15) Total imbalance in load when the cpu was idle
>> +    16) Total imbalance in utilization when the cpu was idle
>> +    17) Total imbalance in number of tasks when the cpu was idle
>> +    18) Total imbalance due to misfit tasks when the cpu was idle
>> +    19) # of times in this domain pull_task() was called when the cpu
>>           was idle
>> -    6)  # of times in this domain pull_task() was called even though
>> +    20) # of times in this domain pull_task() was called even though
> 
> same comment for pull_task.
> 

Acked.

>>           the target task was cache-hot when idle
>> -    7)  # of times in this domain sched_balance_rq() was called but did
>> +    21) # of times in this domain sched_balance_rq() was called but did
>>           not find a busier queue while the cpu was idle
>> -    8)  # of times in this domain a busier queue was found while the
>> +    22) # of times in this domain a busier queue was found while the
>>           cpu was idle but no busier group was found
>> -    9)  # of times in this domain sched_balance_rq() was called when the
>> -        cpu was busy
>> -    10) # of times in this domain sched_balance_rq() checked but found the
>> -        load did not require balancing when busy
>> -    11) # of times in this domain sched_balance_rq() tried to move one or
>> -        more tasks and failed, when the cpu was busy
>> -    12) sum of imbalances discovered (if any) with each call to
>> -        sched_balance_rq() in this domain when the cpu was busy
>> -    13) # of times in this domain pull_task() was called when busy
>> -    14) # of times in this domain pull_task() was called even though the
>> -        target task was cache-hot when busy
>> -    15) # of times in this domain sched_balance_rq() was called but did not
>> -        find a busier queue while the cpu was busy
>> -    16) # of times in this domain a busier queue was found while the cpu
>> -        was busy but no busier group was found
>> -    17) # of times in this domain sched_balance_rq() was called when the
>> -        cpu was just becoming idle
>> -    18) # of times in this domain sched_balance_rq() checked but found the
>> +    23) # of times in this domain sched_balance_rq() was called when the
>> +        cpu was just becoming idle
>> +    24) # of times in this domain sched_balance_rq() checked but found the
>>           load did not require balancing when the cpu was just becoming idle
>> -    19) # of times in this domain sched_balance_rq() tried to move one or more
>> +    25) # of times in this domain sched_balance_rq() tried to move one or more
>>           tasks and failed, when the cpu was just becoming idle
>> -    20) sum of imbalances discovered (if any) with each call to
>> -        sched_balance_rq() in this domain when the cpu was just becoming idle
>> -    21) # of times in this domain pull_task() was called when newly idle
>> -    22) # of times in this domain pull_task() was called even though the
>> +    26) Total imbalance in load when the cpu was just becoming idle
>> +    27) Total imbalance in utilization when the cpu was just becoming idle
>> +    28) Total imbalance in number of tasks when the cpu was just becoming idle
>> +    29) Total imbalance due to misfit tasks when the cpu was just becoming idle
>> +    30) # of times in this domain pull_task() was called when newly idle
>> +    31) # of times in this domain pull_task() was called even though the
>>           target task was cache-hot when just becoming idle
> 
> same comment for pull_task.
> 

Acked.

>> -    23) # of times in this domain sched_balance_rq() was called but did not
>> +    32) # of times in this domain sched_balance_rq() was called but did not
>>           find a busier queue while the cpu was just becoming idle
>> -    24) # of times in this domain a busier queue was found while the cpu
>> +    33) # of times in this domain a busier queue was found while the cpu
>>           was just becoming idle but no busier group was found
>>      Next three are active_load_balance() statistics:
>> -    25) # of times active_load_balance() was called
>> -    26) # of times active_load_balance() tried to move a task and failed
>> -    27) # of times active_load_balance() successfully moved a task
>> +    34) # of times active_load_balance() was called
>> +    35) # of times active_load_balance() tried to move a task and failed
>> +    36) # of times active_load_balance() successfully moved a task
>>      Next three are sched_balance_exec() statistics:
>> -    28) sbe_cnt is not used
>> -    29) sbe_balanced is not used
>> -    30) sbe_pushed is not used
>> +    37) sbe_cnt is not used
>> +    38) sbe_balanced is not used
>> +    39) sbe_pushed is not used
>>      Next three are sched_balance_fork() statistics:
>> -    31) sbf_cnt is not used
>> -    32) sbf_balanced is not used
>> -    33) sbf_pushed is not used
>> +    40) sbf_cnt is not used
>> +    41) sbf_balanced is not used
>> +    42) sbf_pushed is not used
>>      Next three are try_to_wake_up() statistics:
>> -    34) # of times in this domain try_to_wake_up() awoke a task that
>> +    43) # of times in this domain try_to_wake_up() awoke a task that
>>           last ran on a different cpu in this domain
>> -    35) # of times in this domain try_to_wake_up() moved a task to the
>> +    44) # of times in this domain try_to_wake_up() moved a task to the
>>           waking cpu because it was cache-cold on its own cpu anyway
>> -    36) # of times in this domain try_to_wake_up() started passive balancing
>> +    45) # of times in this domain try_to_wake_up() started passive balancing
>>   /proc/<pid>/schedstat
>>   ---------------------
>> diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
>> index 5f563965976c..4346fd81c31f 100644
>> --- a/kernel/sched/stats.c
>> +++ b/kernel/sched/stats.c
>> @@ -103,7 +103,7 @@ void __update_stats_enqueue_sleeper(struct rq *rq, struct task_struct *p,
>>    * Bump this up when changing the output format or the meaning of an existing
>>    * format, so that tools can adapt (or abort)
>>    */
>> -#define SCHEDSTAT_VERSION 16
>> +#define SCHEDSTAT_VERSION 17
>>   static int show_schedstat(struct seq_file *seq, void *v)
>>   {
> 
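
As an aside, for anyone updating tooling for the wider domain line: below is
a rough, untested sketch (purely my own illustration, not code from the
tree) of how a consumer could check the reported schedstat version and the
field count of a domain line before parsing. The expected layout is taken
only from the doc text quoted above, and the <name> field may be absent
without CONFIG_SCHED_DEBUG.

/* Untested illustration: read /proc/schedstat, pick up the version line,
 * then count whitespace-separated fields in the first domain line.  The
 * expected counts below follow the documentation quoted above, not the
 * kernel sources. */
#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *fp = fopen("/proc/schedstat", "r");
	char line[4096];
	unsigned int version = 0;

	if (!fp)
		return 1;

	while (fgets(line, sizeof(line), fp)) {
		if (sscanf(line, "version %u", &version) == 1)
			continue;
		if (strncmp(line, "domain", 6) == 0) {
			int fields = 0;
			char *tok = strtok(line, " \t\n");

			while (tok) {
				fields++;
				tok = strtok(NULL, " \t\n");
			}
			/*
			 * v17: "domain<N> <name> <cpumask>" + 45 counters,
			 * i.e. 48 fields (47 without the <name> extension).
			 * v16: "domain<N> <cpumask>" + 36 counters = 38 fields.
			 */
			printf("version %u, first domain line has %d fields\n",
			       version, fields);
			break;
		}
	}

	fclose(fp);
	return 0;
}
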
--
Thanks and Regards,
Swapnil
