Message-ID: <81d051ee-c428-5360-b459-a4902904d237@linux.alibaba.com>
Date:   Mon, 9 Jul 2018 17:12:33 +0800
From:   王贇 <yun.wang@...ux.alibaba.com>
To:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] tg: show the sum wait time of an task group



On 2018/7/4 11:27 AM, 王贇 wrote:
> Although we can rely on cpuacct to present the cpu usage of task
> group, it is hard to tell how intense the competition is between
> these groups on cpu resources.
> 
> Monitoring the wait time of each process, or reading sched_debug,
> could cost too much, and neither accurately represents the contention
> between groups; we need the wait time at the group level.
> 
> Thus we introduce a per-group wait_sum to represent the contention
> between task groups, which is simply the sum of the wait time of the
> group's cfs_rq on each cpu.
> 
> The 'cpu.stat' is modified to show the statistic, like:
> 
>    nr_periods 0
>    nr_throttled 0
>    throttled_time 0
>    wait_sum 2035098795584
> 
> Now we can monitor the change in wait_sum to tell how much a task
> group is suffering in the competition for cpu resources.
> 
> For example:
>    (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%
> 
> means the task group spent X percent of the period waiting
> for the cpu.

Hi, Peter

What do you think about this proposal?

There are situations where tasks in some groups suffer much more
than others, and it would be good to have an easy way to locate them.
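
To make the formula from the changelog concrete, here is a minimal
user-space sketch of how a monitor could consume the new field. It is
only an illustration: the helper names parse_wait_sum() and
wait_percent() are hypothetical, and it assumes the patch is applied and
schedstats are enabled, since otherwise 'cpu.stat' has no wait_sum line.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find the "wait_sum <value>" line in the text of
 * a cpu.stat file. Returns 0 on success, -1 if the field is missing
 * (schedstats disabled, root group, or kernel without the patch).
 */
static int parse_wait_sum(const char *stat, uint64_t *out)
{
    const char *line = stat;

    while (line && *line) {
        /* The leading space in the format skips any indentation. */
        if (sscanf(line, " wait_sum %" SCNu64, out) == 1)
            return 0;
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/*
 * Hypothetical helper implementing
 *   (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns)
 * Returns the percentage, or -1.0 if the inputs cannot be used.
 */
static double wait_percent(uint64_t last_wait_sum, uint64_t wait_sum,
                           unsigned int nr_cpu, uint64_t period_ns)
{
    if (nr_cpu == 0 || period_ns == 0 || wait_sum < last_wait_sum)
        return -1.0;
    return (double)(wait_sum - last_wait_sum) * 100.0 /
           ((double)nr_cpu * (double)period_ns);
}
```

A monitor would read 'cpu.stat' twice, period_ns apart, pass both
buffers to parse_wait_sum() and feed the two values to wait_percent().
For instance, a delta of 2 seconds of wait time over a 1 second period
on 4 cpus gives 50%.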

Regards,
Michael Wang

> 
> Signed-off-by: Michael Wang <yun.wang@...ux.alibaba.com>
> ---
> 
> Since v1:
>    Use schedstat_val to avoid compile error
>    Check and skip root_task_group
> 
>   kernel/sched/core.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d8fac..80ab995 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6781,6 +6781,8 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)
> 
>   static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
>   {
> +    int i;
> +    u64 ws = 0;
>       struct task_group *tg = css_tg(seq_css(sf));
>       struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
> 
> @@ -6788,6 +6790,12 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
>       seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
>       seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
> 
> +    if (schedstat_enabled() && tg != &root_task_group) {
> +        for_each_possible_cpu(i)
> +            ws += schedstat_val(tg->se[i]->statistics.wait_sum);
> +        seq_printf(sf, "wait_sum %llu\n", ws);
> +    }
> +
>       return 0;
>   }
>   #endif /* CONFIG_CFS_BANDWIDTH */
