Date:   Wed, 18 Jul 2018 08:26:45 -0700
From:   Shakeel Butt <shakeelb@...gle.com>
To:     bmerry@....ac.za
Cc:     Michal Hocko <mhocko@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>
Subject: Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines

On Wed, Jul 18, 2018 at 7:29 AM Bruce Merry <bmerry@....ac.za> wrote:
>
> On 18 July 2018 at 12:42, Michal Hocko <mhocko@...nel.org> wrote:
> > [CC some more people]
> >
> > On Tue 17-07-18 21:23:07, Andrew Morton wrote:
> >> (cc linux-mm)
> >>
> >> On Tue, 3 Jul 2018 08:43:23 +0200 Bruce Merry <bmerry@....ac.za> wrote:
> >>
> >> > Hi
> >> >
> >> > I've run into an odd performance issue in the kernel, and not being a
> >> > kernel dev or knowing terribly much about cgroups, am looking for
> >> > advice on diagnosing the problem further (I discovered this while
> >> > trying to pin down high CPU load in cadvisor).
> >> >
> >> > On some machines in our production system, cat
> >> > /sys/fs/cgroup/memory/memory.stat is extremely slow (500ms on one
> >> > machine), while on other nominally identical machines it is fast
> >> > (2ms).
> >
> > Could you try to use ftrace to see where the time is spent?
>
> Thanks for looking into this. I'm not familiar with ftrace. Can you
> give me a specific command line to run? Based on "perf record cat
> /sys/fs/cgroup/memory/memory.stat"/"perf report", I see the following:
>
>   42.09%  cat      [kernel.kallsyms]  [k] memcg_stat_show
>   29.19%  cat      [kernel.kallsyms]  [k] memcg_sum_events.isra.22
>   12.41%  cat      [kernel.kallsyms]  [k] mem_cgroup_iter
>    5.42%  cat      [kernel.kallsyms]  [k] _find_next_bit
>    4.14%  cat      [kernel.kallsyms]  [k] css_next_descendant_pre
>    3.44%  cat      [kernel.kallsyms]  [k] find_next_bit
>    2.84%  cat      [kernel.kallsyms]  [k] mem_cgroup_node_nr_lru_pages
>
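
On the ftrace question above: a minimal sketch, assuming tracefs is
mounted at /sys/kernel/debug/tracing and that memcg_stat_show shows up
in available_filter_functions, would be something like:

  # trace only memcg_stat_show and everything it calls
  cd /sys/kernel/debug/tracing
  echo memcg_stat_show > set_graph_function
  echo function_graph > current_tracer
  cat /sys/fs/cgroup/memory/memory.stat > /dev/null
  head -100 trace
  # reset when done
  echo nop > current_tracer
  echo > set_graph_function

That said, your perf output above already points at the stat traversal
itself, so the ftrace output would mostly confirm the same picture.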

It seems like you are using cgroup-v1. How many nodes are in your
memcg tree, and how many CPUs does the system have?
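
A quick way to get rough numbers for both, assuming the v1 memory
controller is mounted at the usual path:

  # cgroups currently present in the memory hierarchy
  find /sys/fs/cgroup/memory -type d | wc -l
  # online CPUs
  nproc

(Note this only counts cgroups that still have directories; cgroups
that were removed but still have pages charged to them may also be
walked, and won't show up here.)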

Please note that memcg_stat_show, i.e. reading memory.stat in
cgroup-v1, is not as optimized as in cgroup-v2. The function
memcg_stat_show() in 4.13 does ~17 tree walks, and for ~12 of those
walks it goes through all CPUs for each node in the memcg tree. In
4.16, a983b5ebee57 ("mm: memcontrol: fix excessive complexity in
memory.stat reporting") optimizes away the per-cpu traversal at the
expense of some accuracy. The next optimization would be to do just
one memcg tree traversal, similar to cgroup-v2's memory_stat_show().

Anyway, is it possible for you to try the 4.16 kernel?
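
If you want to check whether a given kernel already carries that fix
(assuming you have a kernel git tree handy):

  # prints the first tag that contains the commit
  git describe --contains a983b5ebee57
  # the kernel you are currently running
  uname -r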

Shakeel
