Message-ID: <YC2CKyaeF2bqvpMk@cmpxchg.org>
Date:   Wed, 17 Feb 2021 15:52:59 -0500
From:   Johannes Weiner <hannes@...xchg.org>
To:     Michal Koutný <mkoutny@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Tejun Heo <tj@...nel.org>, Michal Hocko <mhocko@...e.com>,
        Roman Gushchin <guro@...com>,
        Shakeel Butt <shakeelb@...gle.com>, linux-mm@...ck.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
        kernel-team@...com
Subject: Re: [PATCH v3 4/8] cgroup: rstat: support cgroup1

On Wed, Feb 17, 2021 at 06:42:32PM +0100, Michal Koutný wrote:
> Hello.
> 
> On Tue, Feb 09, 2021 at 11:33:00AM -0500, Johannes Weiner <hannes@...xchg.org> wrote:
> > @@ -1971,10 +1978,14 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
> >  	if (ret)
> >  		goto destroy_root;
> >  
> > -	ret = rebind_subsystems(root, ss_mask);
> > +	ret = cgroup_rstat_init(root_cgrp);
> Would it make sense to do cgroup_rstat_init() only if there's a subsys
> in ss_mask that makes use of rstat?
> (On legacy systems there could be individual hierarchy for each
> controller so the rstat space can be saved.)

It's possible, but I don't think it's worth the trouble.

It would have to be done from rebind_subsystems(), as remount can add
more subsystems to an existing cgroup1 root. That in turn means we'd
have to have separate init paths for cgroup1 and cgroup2.

While we split cgroup1 and cgroup2 paths where necessary in the code,
it's a significant maintenance burden and a not unlikely source of
subtle errors (see the recent 'fix swap undercounting in cgroup2').

In this case, we're talking about a relatively small data structure
and the overhead is per mountpoint. Comparatively, we're allocating
the full vmstats structures for cgroup1 groups which barely use them,
and cgroup1 softlimit tree structures for each cgroup2 group.

So I don't think it's a good tradeoff. Subtle bugs that require kernel
patches are more disruptive to the user experience than the amount of
memory in question here.

> > @@ -285,8 +285,6 @@ void __init cgroup_rstat_boot(void)
> >  
> >  	for_each_possible_cpu(cpu)
> >  		raw_spin_lock_init(per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu));
> > -
> > -	BUG_ON(cgroup_rstat_init(&cgrp_dfl_root.cgrp));
> >  }
> Regardless of the suggestion above, this removal obsoletes the comment
> in cgroup_rstat_init():
> 
>          int cpu;
>  
> -        /* the root cgrp has rstat_cpu preallocated */
>          if (!cgrp->rstat_cpu) {
>                  cgrp->rstat_cpu = alloc_percpu(struct cgroup_rstat_cpu);

Oh, I'm not removing the init call, I'm merely moving it from
cgroup_rstat_boot() to cgroup_setup_root().

The default root group has statically preallocated percpu data before
and after this patch. See cgroup.c:

  static DEFINE_PER_CPU(struct cgroup_rstat_cpu, cgrp_dfl_root_rstat_cpu);

  /* the default hierarchy */
  struct cgroup_root cgrp_dfl_root = { .cgrp.rstat_cpu = &cgrp_dfl_root_rstat_cpu };
  EXPORT_SYMBOL_GPL(cgrp_dfl_root);
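
To illustrate why the comment in cgroup_rstat_init() still holds, here is a
rough userspace sketch of the pattern. It is not the kernel code: the
*_model types, NR_CPUS_MODEL, and the -1 error return are assumptions made
up for this example, and calloc() stands in for alloc_percpu(). The point it
shows is only that a root whose rstat_cpu pointer is set statically skips
the allocation, while any other root (e.g. a cgroup1 root) gets one.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
#define NR_CPUS_MODEL 4

struct rstat_cpu_model {
	int updated;
};

struct cgroup_model {
	struct rstat_cpu_model *rstat_cpu;
};

/* Models the statically preallocated per-cpu data of the default root. */
static struct rstat_cpu_model dfl_root_rstat_cpu[NR_CPUS_MODEL];

/* Models cgrp_dfl_root: the pointer is wired up at build time. */
static struct cgroup_model dfl_root = { .rstat_cpu = dfl_root_rstat_cpu };

/*
 * Models the init step: allocate only when nothing was preallocated,
 * then reset the per-cpu state either way.
 */
static int rstat_init_model(struct cgroup_model *cgrp)
{
	int cpu;

	assert(cgrp != NULL);

	/* the root cgrp has rstat_cpu preallocated */
	if (!cgrp->rstat_cpu) {
		cgrp->rstat_cpu = calloc(NR_CPUS_MODEL, sizeof(*cgrp->rstat_cpu));
		if (!cgrp->rstat_cpu)
			return -1; /* assumption: the kernel would return -ENOMEM */
	}

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		cgrp->rstat_cpu[cpu].updated = 0;

	return 0;
}
```

Calling rstat_init_model(&dfl_root) leaves the static pointer untouched, so
it does not matter whether the call is made at boot or from the root setup
path. A zero-initialized struct cgroup_model takes the allocation branch.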