Date:   Thu, 18 Aug 2022 12:20:33 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Yafang Shao <laoar.shao@...il.com>
Cc:     ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
        kafai@...com, songliubraving@...com, yhs@...com,
        john.fastabend@...il.com, kpsingh@...nel.org, sdf@...gle.com,
        haoluo@...gle.com, jolsa@...nel.org, hannes@...xchg.org,
        mhocko@...nel.org, roman.gushchin@...ux.dev, shakeelb@...gle.com,
        songmuchun@...edance.com, akpm@...ux-foundation.org,
        lizefan.x@...edance.com, cgroups@...r.kernel.org,
        netdev@...r.kernel.org, bpf@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH bpf-next v2 00/12] bpf: Introduce selectable memcg for
 bpf map

Hello,

On Thu, Aug 18, 2022 at 02:31:06PM +0000, Yafang Shao wrote:
> After switching to memcg-based bpf memory accounting to limit bpf
> memory, some unexpected issues jumped out at us.
> 1. Memory usage is not consistent between the first generation and
> new generations.
> 2. After the first generation is destroyed, bpf memory can't be
> limited if the bpf maps are not preallocated, because their charges
> will be reparented.
> 
> This patchset tries to resolve these issues by introducing an
> independent memcg to limit the bpf memory.

memcg folks would have better-informed opinions, but from a generic cgroup
POV I don't think this is a good direction to take. This isn't a problem
limited to bpf progs, and it doesn't make a whole lot of sense to solve it
for bpf alone.

We have the exact same problem for any resource which spans multiple
instances of a service, including page cache, tmpfs instances, and anything
else which can persist longer than process lifetime. My current opinion is
that this is best solved by introducing an extra cgroup layer to represent
the persistent entity and putting the per-instance cgroups under it.
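
To make the layering concrete, here is a minimal sketch of what setting up
such a hierarchy could look like through cgroupfs. This assumes a cgroup2
mount at /sys/fs/cgroup; the "myservice" and "instance-1" names are
hypothetical, chosen only for illustration:

  import os

  CGROOT = "/sys/fs/cgroup"  # assumed cgroup2 mount point

  # The persistent entity outlives any single instance of the service.
  persistent = os.path.join(CGROOT, "myservice")
  # Each generation of the service gets its own cgroup underneath it.
  instance = os.path.join(persistent, "instance-1")
  os.makedirs(instance, exist_ok=True)

  # A service-wide memory limit set on the persistent cgroup keeps applying
  # even after instance-1 is torn down: charges that outlive the instance
  # (e.g. non-preallocated bpf maps) are reparented to "myservice" rather
  # than escaping to an unrelated ancestor.
  with open(os.path.join(persistent, "memory.max"), "w") as f:
      f.write("1G")

  # Launch the current instance's processes inside the per-instance cgroup.
  with open(os.path.join(instance, "cgroup.procs"), "w") as f:
      f.write(str(os.getpid()))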

It does require reorganizing the hierarchy from the userspace POV, but the
end result is really desirable. We get entities accurately representing what
needs to be tracked, and control over the granularity of accounting and
control (e.g. folks who don't care about telling apart the current
instance's usage can simply not enable controllers at the persistent-entity
level).
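
That granularity choice maps directly onto cgroup.subtree_control.
Continuing the hypothetical "myservice" layout from the sketch above:

  import os

  persistent = "/sys/fs/cgroup/myservice"  # hypothetical, as above

  # Opting in: enabling the memory controller for the child cgroups gives a
  # per-instance breakdown in addition to the service-wide totals.
  with open(os.path.join(persistent, "cgroup.subtree_control"), "w") as f:
      f.write("+memory")

  # Opting out is just as simple: never write "+memory" here (or write
  # "-memory"), and all usage is accounted only at the "myservice" level.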

We can surely discuss other approaches, but my current intuition is that it'd
be really difficult to come up with a better solution than introducing
persistent service entities through layering.

So, please consider the approach nacked for the time being.

Thanks.

-- 
tejun
