Message-ID: <Yog4jCygrYPtPXg5@carbon>
Date: Fri, 20 May 2022 17:55:40 -0700
From: Roman Gushchin <roman.gushchin@...ux.dev>
To: Vasily Averin <vvs@...nvz.org>
Cc: Michal Koutný <mkoutny@...e.com>,
Shakeel Butt <shakeelb@...gle.com>, kernel@...nvz.org,
linux-kernel@...r.kernel.org, Vlastimil Babka <vbabka@...e.cz>,
Michal Hocko <mhocko@...e.com>, cgroups@...r.kernel.org
Subject: Re: [PATCH 3/4] memcg: enable accounting for struct cgroup
On Fri, May 20, 2022 at 11:16:32PM +0300, Vasily Averin wrote:
> On 5/20/22 10:24, Vasily Averin wrote:
> > On 5/19/22 19:53, Michal Koutný wrote:
> >> On Fri, May 13, 2022 at 06:52:12PM +0300, Vasily Averin <vvs@...nvz.org> wrote:
> >>> Creating each new cgroup allocates 4Kb for struct cgroup. This is the
> >>> largest memory allocation in this scenario and is especially important
> >>> for small VMs with 1-2 CPUs.
> >>
> >> What do you mean by this argument?
> >>
> >> (On bigger irons, the percpu components become dominant, e.g. struct
> >> cgroup_rstat_cpu.)
> >
> > Michal, Shakeel,
> > thank you very much for your feedback; it helps me understand how to improve
> > the methodology of my accounting analysis.
> > I considered the general case and looked for the largest memory allocations.
> > Now I think it would be better to split all the allocations into:
> > - a common part, executed for any cgroup type (i.e. cgroup_mkdir and cgroup_create),
> > - per-controller parts,
> > and focus on two corner cases: single-CPU VMs and "big irons".
> > This helps clarify which allocations are important to account and which ones
> > can be safely ignored.
> >
> > So right now I'm going to redo the calculations and hope it doesn't take long.
>
> common part: ~11Kb   +  318 bytes percpu
> memcg:       ~17Kb   + 4692 bytes percpu
> cpu:         ~2.5Kb  + 1036 bytes percpu
> cpuset:      ~3Kb    +   12 bytes percpu
> blkcg:       ~3Kb    +   12 bytes percpu
> pid:         ~1.5Kb  +   12 bytes percpu
> perf:        ~320b   +   60 bytes percpu
> -------------------------------------------
> total:       ~38Kb   + 6142 bytes percpu
> currently accounted: 4668 bytes percpu
>
> Results:
> a) I'll add accounting for cgroup_rstat_cpu and psi_group_cpu,
>    as they are allocated in the common part and consume 288 bytes percpu
>    (see the sketch below).
> b) It makes sense to add accounting for simple_xattr(), as Michal recommended,
>    especially because it can grow beyond 4Kb.
> c) It looks like the rest of the allocations can be ignored.
>
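
For (a): I assume that means switching these two percpu allocations to the
accounted gfp variant? Something like the hunks below (an untested sketch
from my reading of the sources, not an actual patch, so the context may
differ in your tree):

	--- a/kernel/cgroup/rstat.c
	+++ b/kernel/cgroup/rstat.c
	@@ cgroup_rstat_init():
	-		cgrp->rstat_cpu = alloc_percpu(struct cgroup_rstat_cpu);
	+		cgrp->rstat_cpu = alloc_percpu_gfp(struct cgroup_rstat_cpu,
	+						   GFP_KERNEL_ACCOUNT);

	--- a/kernel/sched/psi.c
	+++ b/kernel/sched/psi.c
	@@ psi_cgroup_alloc():
	-	cgroup->psi.pcpu = alloc_percpu(struct psi_group_cpu);
	+	cgroup->psi.pcpu = alloc_percpu_gfp(struct psi_group_cpu,
	+					    GFP_KERNEL_ACCOUNT);

The same GFP_KERNEL_ACCOUNT switch would presumably cover (b) in
simple_xattr_alloc() as well.
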
> Details are below
> ('=' -- already accounted, '+' -- to be accounted, '~' -- see KERNFS, '?' -- perhaps later)
> (columns: count, flag, object size, subtotal, running total, call site)
>
> common part:
> 16 ~ 352 5632 5632 KERNFS (*)
> 1 + 4096 4096 9728 (cgroup_mkdir+0xe4)
> 1 584 584 10312 (radix_tree_node_alloc.constprop.0+0x89)
> 1 192 192 10504 (__d_alloc+0x29)
> 2 72 144 10648 (avc_alloc_node+0x27)
> 2 64 128 10776 (percpu_ref_init+0x6a)
> 1 64 64 10840 (memcg_list_lru_alloc+0x21a)
>
> 1 + 192 192 192 call_site=psi_cgroup_alloc+0x1e
> 1 + 96 96 288 call_site=cgroup_rstat_init+0x5f
> 2 12 24 312 call_site=percpu_ref_init+0x23
> 1 6 6 318 call_site=__percpu_counter_init+0x22
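
If I read the series right, patch 3/4 itself boils down to the same kind of
change for struct cgroup: switching the allocation in cgroup_create() to the
accounted gfp, i.e. roughly (again my own sketch, not a quote of the actual
patch):

	--- a/kernel/cgroup/cgroup.c
	+++ b/kernel/cgroup/cgroup.c
	@@ cgroup_create():
	-	cgrp = kzalloc(struct_size(cgrp, ancestor_ids, (level + 1)),
	-		       GFP_KERNEL);
	+	cgrp = kzalloc(struct_size(cgrp, ancestor_ids, (level + 1)),
	+		       GFP_KERNEL_ACCOUNT);
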
I'm curious, how did you generate this data?
Just an idea: it could be a nice tool, placed somewhere in tools/cgroup/...
Thanks!