Message-ID: <CAHCio2jP7mZpE6sAb_Rewbc2nx4s0NY3sOYmsF69EqE4Ysbo=w@mail.gmail.com>
Date: Mon, 28 Jan 2019 15:21:06 +0800
From: 禹舟键 <ufo19890607@...il.com>
To: 王贇 <yun.wang@...ux.alibaba.com>
Cc: mingo@...hat.com, Peter Zijlstra <peterz@...radead.org>,
Wind Yu <yuzhoujian@...ichuxing.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/debug: Show intergroup and hierarchy sum wait time
of a task group
Hi Michael,
> Task competition inside a cgroup won't be considered as cgroup's
> competition, please try create another cgroup with dead loop on
> each CPU
Yes, you are right, but I don't think we only need to account for
competition between cgroups, because that factor does not reflect
conditions inside a cgroup. We still need a proper method to evaluate
CPU competition inside a cgroup.
> Running tasks doesn't means no competition, only if that cgroup occupied
> the CPU exclusively at that moment.
I care a lot about CPU competition inside a cgroup. Without a cgroup
hierarchy wait_sum, I can only read '/proc/$pid/schedstat' thousands
of times to get every task's wait_sum, and that definitely takes a
really long time (40ms for 8000 tasks in a container).
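For reference, the per-task scan described above looks roughly like
this. It assumes a cgroup v1 cpu controller mounted at
/sys/fs/cgroup/cpu; the helper names are illustrative, not from the
patch:

```python
def parse_schedstat(line):
    """Return the wait time (second field, in nanoseconds) from one
    /proc/<pid>/schedstat line."""
    fields = line.split()
    return int(fields[1])

def cgroup_wait_sum(cgroup_path):
    """Sum wait time over every task currently listed in the cgroup's
    'tasks' file. Note the cost: one open/read/parse per task, so a
    container with thousands of tasks needs thousands of syscalls."""
    total = 0
    with open(cgroup_path + "/tasks") as f:
        pids = [int(p) for p in f]
    for pid in pids:
        try:
            with open("/proc/%d/schedstat" % pid) as f:
                total += parse_schedstat(f.read())
        except OSError:
            pass  # task exited while we were scanning; it is simply missed
    return total
```

A single per-cgroup wait_sum file would replace the whole loop with
one read.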
> No offense but I'm afraid you misunderstand the problem we try to solve
> by wait_sum, if your purpose is to have a way to tell whether there are
> sufficient CPU inside a container, please try lxcfs + top, if there are
> almost no idle and load is high, then the CPU resource is not sufficient.
emmmm... Maybe I didn't make it clear. We need to dynamically adjust
the number of CPUs for a container based on the running state of the
tasks inside it. If we find that tasks' wait_sum is really high, we
will add more CPU cores to the container; otherwise we will take some
CPUs away from it. In a word, we want to ensure 'co-scheduling' for
high-priority containers.
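The adjustment policy above could be sketched like this; the
thresholds, the ratio heuristic, and the function name are all
invented for illustration:

```python
def next_cpu_count(current_cpus, wait_delta_ns, interval_ns,
                   grow_ratio=0.2, shrink_ratio=0.05, max_cpus=64):
    """Decide a container's next CPU count from the wait time its tasks
    accumulated over one sampling interval.

    pressure is the wait time expressed in "CPUs' worth" of waiting:
    wait_delta_ns / interval_ns == 1.0 means tasks spent a full CPU's
    worth of time queued but not running.
    """
    pressure = wait_delta_ns / interval_ns
    if pressure > grow_ratio * current_cpus:
        # Tasks are queuing heavily: give the container one more CPU.
        return min(current_cpus + 1, max_cpus)
    if pressure < shrink_ratio * current_cpus and current_cpus > 1:
        # Almost no queuing: reclaim a CPU for other containers.
        return current_cpus - 1
    return current_cpus
```

The point is that the whole loop needs only one wait_sum delta per
container per interval, which is what the hierarchy counter provides.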
> Frankly speaking this sounds like a supplement rather than a missing piece,
> although we don't rely on lxcfs and modify the kernel ourselves to support
> container environment, I still don't think such kind of solutions should be
> in kernel.
I don't care whether this value is considered a supplement or a
missing piece. I only care about how I can assess the running state
inside a container. I think lxcfs is really a good solution for
improving the visibility of container resources, but it is not good
enough at the moment:
/proc/cpuinfo
/proc/diskstats
/proc/meminfo
/proc/stat
/proc/swaps
/proc/uptime
We can read these procfs files inside a container, but they still
cannot reflect real-time information. Consider the following scenario:
a 'rabbit' process spawns 2000 tasks every 30ms, and these child tasks
run for just 1~5ms and then exit. How can we detect this thrashing
workload without the hierarchy wait_sum?
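The miss rate in that scenario can be estimated with a toy
calculation; the model (a periodic /proc scan catching a task only if
a sample lands inside its lifetime) and the numbers are purely
illustrative:

```python
def visible_fraction(task_lifetime_ms, sample_period_ms):
    """Fraction of short-lived tasks a periodic /proc scan can expect
    to catch: a task is seen only if one sampling instant falls inside
    its lifetime, so lifetimes shorter than the period are mostly
    invisible."""
    return min(1.0, task_lifetime_ms / sample_period_ms)
```

With 5ms tasks and a 30ms scan period, roughly five of every six
tasks exit unseen, so their wait time never appears in any
per-pid schedstat read; a counter accumulated in the task group
loses nothing when its tasks exit.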
Thanks,
Yuzhoujian