Message-ID: <207ef46c-672c-27c8-2012-735bd692a6de@linux.alibaba.com>
Date:   Wed, 27 Nov 2019 09:48:44 +0800
From:   王贇 <yun.wang@...ux.alibaba.com>
To:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Iurii Zaikin <yzaikin@...gle.com>,
        Michal Koutný <mkoutny@...e.com>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-doc@...r.kernel.org,
        "Paul E. McKenney" <paulmck@...ux.ibm.com>
Subject: [PATCH v2 0/3] sched/numa: introduce advanced numa statistic

Since v1:
  * Improved documentation

A modern production environment may use hundreds of cgroups to control
resources for different workloads, along with complicated resource
bindings.

On NUMA platforms with multiple nodes, things become even more
complicated. We want as many memory accesses as possible to be local to
improve performance, and NUMA Balancing works hard to achieve that.
However, a wrong memory policy or node binding can easily waste that
effort and result in a lot of remote page accesses.

We need to detect such problems so that we have a chance to fix them
before too much damage is done. However, there is no good approach yet
to help catch the culprit that introduced the remote accesses.

This patch set tries to fill in the missing pieces by introducing
per-cgroup NUMA locality/exectime statistics and exposing the per-task
page migration failure counter. With these statistics, we can monitor
NUMA efficiency on a daily basis and raise a warning when things go
too wrong.
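
As a rough illustration of the kind of daily monitoring meant here, the
sketch below flags cgroups whose share of local accesses falls under a
threshold. It is only a sketch under stated assumptions: the input shape
(a cgroup name mapped to local and remote access counts), the function
names, and the 80% threshold are all hypothetical and are not the
interface added by this series; the actual statistics format is
described in the documentation patch.

```python
# Hypothetical monitoring sketch. The (local, remote) counters and the
# threshold are assumptions for illustration, not the format exposed by
# this patch set.

def locality_percent(local_accesses, remote_accesses):
    """Return the percentage of accesses that were local, or None if idle."""
    total = local_accesses + remote_accesses
    if total == 0:
        return None  # no accesses sampled, locality is undefined
    return 100.0 * local_accesses / total


def find_low_locality(samples, threshold_percent=80.0):
    """Return (cgroup, percent) pairs whose locality is below the threshold.

    samples maps a cgroup name to a (local, remote) tuple of access counts.
    """
    flagged = []
    for name, (local, remote) in sorted(samples.items()):
        pct = locality_percent(local, remote)
        if pct is not None and pct < threshold_percent:
            flagged.append((name, pct))
    return flagged
```

A monitoring job could collect such counts periodically and warn about
every cgroup that find_low_locality() returns, which narrows down where
a wrong memory policy or node binding may be in effect.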

Please check the third patch for more details.

Thanks to Peter, Mel and Michal for the good advice.

Michael Wang (3):
  sched/numa: advanced per-cgroup numa statistic
  sched/numa: expose per-task pages-migration-failure counter
  sched/numa: documentation for per-cgroup numa stat

 Documentation/admin-guide/cg-numa-stat.rst      | 163 ++++++++++++++++++++++++
 Documentation/admin-guide/index.rst             |   1 +
 Documentation/admin-guide/kernel-parameters.txt |   4 +
 Documentation/admin-guide/sysctl/kernel.rst     |   9 ++
 include/linux/sched.h                           |  18 ++-
 include/linux/sched/sysctl.h                    |   6 +
 init/Kconfig                                    |   9 ++
 kernel/sched/core.c                             |  91 +++++++++++++
 kernel/sched/debug.c                            |   1 +
 kernel/sched/fair.c                             |  33 +++++
 kernel/sched/sched.h                            |  17 +++
 kernel/sysctl.c                                 |  11 ++
 12 files changed, 362 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/admin-guide/cg-numa-stat.rst

-- 
2.14.4.44.g2045bb6
