Message-ID: <YfgT/9tEREQNiiAN@cmpxchg.org>
Date: Mon, 31 Jan 2022 11:53:19 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Michal Hocko <mhocko@...e.com>
Cc: Waiman Long <longman@...hat.com>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Petr Mladek <pmladek@...e.com>,
Steven Rostedt <rostedt@...dmis.org>,
Sergey Senozhatsky <senozhatsky@...omium.org>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
linux-mm@...ck.org, Ira Weiny <ira.weiny@...el.com>,
Rafael Aquini <aquini@...hat.com>
Subject: Re: [PATCH v2 3/3] mm/page_owner: Dump memcg information

On Mon, Jan 31, 2022 at 10:38:51AM +0100, Michal Hocko wrote:
> On Sat 29-01-22 15:53:15, Waiman Long wrote:
> > It was found that a number of offlined memcgs were not freed because
> > they were pinned by some charged pages that were present. Even "echo
> > 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
> > offlined but not freed memcgs tend to increase in number over time with
> > the side effect that percpu memory consumption as shown in /proc/meminfo
> > also increases over time.
> >
> > In order to find out more information about those pages that pin
> > offlined memcgs, the page_owner feature is extended to dump memory
> > cgroup information especially whether the cgroup is offlined or not.
>
> It is not really clear to me how this is supposed to be used. Are you
> really dumping all the pages in the system to find out offline memcgs?
> That looks rather clumsy to me. I am not against adding memcg
> information to the page owner output. That can be useful in other
> contexts.

We've sometimes done exactly that in production, but with drgn
scripts. It's not very common, so it doesn't need to be very efficient
either. Typically, we'd encounter a host with an unusual number of
dying cgroups, ssh in and poke around with drgn to figure out what
kind of objects are still pinning the cgroups in question.

This patch would make that process a little easier, I suppose.
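
For the first step described above (spotting a host with an unusual
number of dying cgroups), a minimal sketch follows. It assumes cgroup v2
mounted at /sys/fs/cgroup and relies on the nr_dying_descendants field
of cgroup.stat. The function names and the default mount point are
illustrative only and are not part of the patch or of any drgn script.
It only counts dying cgroups; it does not show which objects pin them.

```python
from pathlib import Path


def parse_cgroup_stat(text):
    """Parse the 'key value' lines of a cgroup v2 cgroup.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or malformed lines; values are non-negative integers.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def count_dying_cgroups(root="/sys/fs/cgroup"):
    """Return nr_dying_descendants for the cgroup at `root` (assumed v2 mount).

    Returns 0 if the field is missing from cgroup.stat.
    """
    text = (Path(root) / "cgroup.stat").read_text()
    return parse_cgroup_stat(text).get("nr_dying_descendants", 0)
```

Calling count_dying_cgroups() on the root cgroup gives a host-wide
count; passing a subtree path narrows it to that subtree.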