Date: Wed, 2 Feb 2022 11:29:36 -0500
From: Waiman Long <longman@...hat.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Roman Gushchin <guro@...com>, Johannes Weiner <hannes@...xchg.org>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Petr Mladek <pmladek@...e.com>,
Steven Rostedt <rostedt@...dmis.org>,
Sergey Senozhatsky <senozhatsky@...omium.org>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
linux-mm@...ck.org, Ira Weiny <ira.weiny@...el.com>,
Rafael Aquini <aquini@...hat.com>
Subject: Re: [PATCH v2 3/3] mm/page_owner: Dump memcg information
On 2/2/22 03:57, Michal Hocko wrote:
> On Tue 01-02-22 11:41:19, Waiman Long wrote:
>> On 2/1/22 05:49, Michal Hocko wrote:
> [...]
>>> Could you be more specific? Offlined memcgs are still part of the
>>> hierarchy IIRC. So it shouldn't be much more than iterating the whole
>>> cgroup tree and collect interesting data about dead cgroups.
>> What I mean is that without piggybacking on top of page_owner, we will have
>> to add a lot more code to collect and display that information, which may
>> have some overhead of its own.
> Yes, there is nothing like a free lunch. Page owner is certainly a tool
> that can be used. My main concern is that this tool doesn't really
> scale on large machines with lots of memory. It will provide very
> detailed information but I am not sure this is particularly helpful to
> most admins (why should people process tons of allocation backtraces in
> the first place). Wouldn't it be sufficient to have per dead memcg stats
> to see where the memory sits?
>
> Accumulated offline memcgs is something that bothers more people and I
> am really wondering whether we can do more for those people to evaluate
> the current state.
You won't get the stack backtrace information without page_owner
enabled. I believe that is a helpful piece of information. I don't
expect page_owner to be enabled by default on production systems because
of its memory overhead.

I believe you can actually see the number of memory cgroups present by
looking at the /proc/cgroups file, though you can't tell how many of
them are offline memcgs. So if one suspects that there are a large number
of offline memcgs, one can set up a test environment with page_owner
enabled for further analysis.
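As a rough illustration of the /proc/cgroups check mentioned above, here is
a minimal Python sketch. It assumes the usual four-column layout of that
file (subsys_name, hierarchy, num_cgroups, enabled); the function names
parse_proc_cgroups() and memcg_count() are made up for this example and are
not part of any kernel or library interface. The number it returns is the
total for the memory controller and, as noted above, does not say how many
of those memcgs are offline.

```python
from typing import Dict


def parse_proc_cgroups(text: str) -> Dict[str, int]:
    """Map each controller name to its num_cgroups column.

    Assumes lines of the form: subsys_name hierarchy num_cgroups enabled
    """
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#subsys_name ..." header line.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a line in the expected layout
        name, _hierarchy, num_cgroups, _enabled = fields[:4]
        counts[name] = int(num_cgroups)
    return counts


def memcg_count(path: str = "/proc/cgroups") -> int:
    """Return num_cgroups for the memory controller (hypothetical helper)."""
    with open(path) as f:
        return parse_proc_cgroups(f.read())["memory"]
```

Calling memcg_count() periodically on a suspect machine and watching
whether the value keeps growing would be one cheap way to decide if a
page_owner-enabled test setup is worth the effort.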
Cheers,
Longman