Message-ID: <029147be35b5173d5eb10c182e124ac9d2f1f0ba.camel@redhat.com>
Date: Fri, 27 Jan 2023 16:29:37 -0300
From: Leonardo Brás <leobras@...hat.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Roman Gushchin <roman.gushchin@...ux.dev>,
Marcelo Tosatti <mtosatti@...hat.com>,
Johannes Weiner <hannes@...xchg.org>,
Shakeel Butt <shakeelb@...gle.com>,
Muchun Song <muchun.song@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
cgroups@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/5] Introduce memcg_stock_pcp remote draining
On Fri, 2023-01-27 at 10:29 +0100, Michal Hocko wrote:
> On Fri 27-01-23 04:35:22, Leonardo Brás wrote:
> > On Fri, 2023-01-27 at 08:20 +0100, Michal Hocko wrote:
> > > On Fri 27-01-23 04:14:19, Leonardo Brás wrote:
> > > > On Thu, 2023-01-26 at 15:12 -0800, Roman Gushchin wrote:
> > > [...]
> > > > > I'd rather opt out of stock draining for isolated cpus: it might slightly reduce
> > > > > the accuracy of memory limits and slightly increase the memory footprint (all
> > > > > those dying memcgs...), but the impact will be limited. Actually it is limited
> > > > > by the number of cpus.
> > > >
> > > > I was discussing this same idea with Marcelo yesterday morning.
> > > >
> > > > The questions we had on the topic were:
> > > > a - About how many pages will the pcp cache hold before draining them itself?
> > >
> > > MEMCG_CHARGE_BATCH (64 currently). And one more clarification. The cache
> > > doesn't really hold any pages. It is a mere counter of how many charges
> > > have been accounted for the memcg page counter. So it is not really
> > > consuming a proportional amount of resources. It just pins the
> > > corresponding memcg. Have a look at consume_stock and refill_stock.
> >
> > I see. Thanks for pointing that out!
> >
> > So in the worst-case scenario the memcg would have reserved 64 pages * (numcpus - 1)
>
> s@...cpus@..._isolated_cpus@
I was thinking of the worst-case scenario, where (ncpus - 1) cpus are isolated.
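Just to check that I got the mechanism right, this is roughly how I picture it
(a simplified userspace model for illustration only, not the actual
consume_stock()/refill_stock() code in mm/memcontrol.c; NCPUS and the
one-page-per-cpu allocation pattern are made up for the example):

/*
 * Toy model of the memcg per-cpu charge stock: charging in batches of
 * MEMCG_CHARGE_BATCH means up to (batch - 1) pages worth of charges can
 * stay cached per cpu, accounted against the memcg but not used.
 */
#include <stdio.h>

#define MEMCG_CHARGE_BATCH 64   /* pages, current kernel value */
#define NCPUS 256               /* the big powerpc box from the example below */

static long page_counter;       /* charges accounted to the memcg */
static long stock[NCPUS];       /* per-cpu cached (unconsumed) charges */

/* Charge one page on @cpu: consume from the local stock if possible,
 * otherwise charge a whole batch to the memcg and cache the remainder. */
static void charge_one_page(int cpu)
{
        if (stock[cpu] == 0) {
                page_counter += MEMCG_CHARGE_BATCH;
                stock[cpu] = MEMCG_CHARGE_BATCH;
        }
        stock[cpu]--;
}

int main(void)
{
        long cached = 0;

        /* a single allocation on every cpu leaves almost a full batch cached */
        for (int cpu = 0; cpu < NCPUS; cpu++)
                charge_one_page(cpu);

        for (int cpu = 0; cpu < NCPUS; cpu++)
                cached += stock[cpu];

        printf("charged: %ld pages, used: %ld pages, cached: %ld pages\n",
               page_counter, page_counter - cached, cached);
        return 0;
}

If none of those cpus gets drained, the "cached" part is what keeps pinning the
memcg and counting against its limit, which is where the numbers below come from.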
>
> > that are not getting used, and may cause an 'earlier' OOM if this amount is
> > needed but can't be freed.
>
> s@OOM@...cg OOM@
> > Staying with the worst case: suppose a big powerpc machine with 256 CPUs, each
> > holding 64k * 64 pages => 1GB of memory - 4MB (one cpu using its resources).
> > It's starting to get too big, but still ok for a machine this size.
>
> It is more about the memcg limit rather than the size of the machine.
> Again, let's focus on the actual usecase. What is the usual memcg setup with
> those isolcpus?
I understand it's about the limit, not the actually allocated memory. When I
point to the machine size, I mean what is expected to be acceptable by a user of
a machine that size.
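Spelling out the numbers I had in mind there, assuming 64KiB pages on that
powerpc machine and MEMCG_CHARGE_BATCH = 64:

        64 pages/cpu * 64 KiB/page =    4 MiB cached per cpu
        4 MiB/cpu    * 256 cpus    = 1024 MiB = 1 GiB
        minus the 4 MiB of the one non-isolated cpu that drains normally

So in the worst case roughly 1 GiB of charges could sit cached and unused
against the memcg limit on a machine that size.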
>
> > The thing is that it can present odd behavior:
> > You have a cgroup that was created before and is now empty, you try to run a
> > given application in it, and it hits OOM.
>
> The application would either consume those cached charges or flush them
> if it is running in a different memcg. Or what do you have in mind?
1 - Create a memcg with a VM inside, with multiple vcpus pinned to isolated cpus.
2 - Run a multi-cpu task inside the VM; it allocates memory on every cpu and
leaves the pcp caches charged.
3 - Try to run a single-cpu task (pinned?) inside the VM, which uses almost all
of the available memory.
4 - memcg OOM, because the charges cached on the other (isolated) cpus still
count against the limit even though nothing is using them.
Does it make sense?
>
> > You then restart the cgroup, run the same application without an issue.
> >
> > Even though it looks like a good possibility, this can be perceived by the
> > user as instability.
> >
> > >
> > > > b - Would it cache any kind of bigger page, or huge page, in this same respect?
> > >
> > > The above should answer this, as well as the follow-ups, I hope. If
> > > not, let me know.
> >
> > IIUC we are talking about normal pages, is that it?
>
> We are talking about memcg charges and those have page granularity.
>
Thanks for the info!
Also, thanks for the feedback!
Leo