[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOm-9aqeKZ7+Jvhc5DxEEzbk4T0iQx8gZ=O1vy6YXnbOkncFsg@mail.gmail.com>
Date: Wed, 18 Jul 2018 16:29:20 +0200
From: Bruce Merry <bmerry@....ac.za>
To: Michal Hocko <mhocko@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Johannes Weiner <hannes@...xchg.org>,
Vladimir Davydov <vdavydov.dev@...il.com>
Subject: Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
On 18 July 2018 at 12:42, Michal Hocko <mhocko@...nel.org> wrote:
> [CC some more people]
>
> On Tue 17-07-18 21:23:07, Andrew Morton wrote:
>> (cc linux-mm)
>>
>> On Tue, 3 Jul 2018 08:43:23 +0200 Bruce Merry <bmerry@....ac.za> wrote:
>>
>> > Hi
>> >
>> > I've run into an odd performance issue in the kernel, and not being a
>> > kernel dev or knowing terribly much about cgroups, am looking for
>> > advice on diagnosing the problem further (I discovered this while
>> > trying to pin down high CPU load in cadvisor).
>> >
>> > On some machines in our production system, cat
>> > /sys/fs/cgroup/memory/memory.stat is extremely slow (500ms on one
>> > machine), while on other nominally identical machines it is fast
>> > (2ms).
>
> Could you try to use ftrace to see where the time is spent?
Thanks for looking into this. I'm not familiar with ftrace. Can you
give me a specific command line to run? Based on "perf record cat
/sys/fs/cgroup/memory/memory.stat"/"perf report", I see the following:
42.09% cat [kernel.kallsyms] [k] memcg_stat_show
29.19% cat [kernel.kallsyms] [k] memcg_sum_events.isra.22
12.41% cat [kernel.kallsyms] [k] mem_cgroup_iter
5.42% cat [kernel.kallsyms] [k] _find_next_bit
4.14% cat [kernel.kallsyms] [k] css_next_descendant_pre
3.44% cat [kernel.kallsyms] [k] find_next_bit
2.84% cat [kernel.kallsyms] [k] mem_cgroup_node_nr_lru_pages
> memory_stat_show should only scale with the depth of the cgroup
> hierarchy for memory.stat to get cumulative numbers. All the rest should
> be simply reads of gathered counters. There is no locking involved in
> the current kernel. What is the kernel version you are using, btw?
Ubuntu 16.04 with kernel 4.13.0-41-generic (so presumably includes
some Ubuntu special sauce).
Some new information: when this occurred on another machine I ran
"echo 2 > /proc/sys/vm/drop_caches" to drop the dentry cache, and
performance immediately improved. Unfortunately, I've not been able to
deliberately reproduce the issue. I've tried doing the following 10^7
times in a loop and while it inflates the dentry cache, it doesn't
cause any significant slowdown:
1. Create a temporary cgroup: mkdir /sys/fs/cgroup/memory/<name>.
2. stat /sys/fs/cgroup/memory/<name>/memory.stat
3. rmdir /sys/fs/cgroup/memory/<name>
I've also tried inflating the dentry cache just by stat-ing millions
of non-existent files, and again, no slowdown. So I'm not sure exactly
how dentry cache is related.
Regards
Bruce
--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
Powered by blists - more mailing lists