lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 19 Aug 2022 06:45:43 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Yafang Shao <laoar.shao@...il.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>, Martin Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        john fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, jolsa@...nel.org,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <songmuchun@...edance.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        lizefan.x@...edance.com, Cgroups <cgroups@...r.kernel.org>,
        netdev <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>
Subject: Re: [PATCH bpf-next v2 00/12] bpf: Introduce selectable memcg for
 bpf map

Hello,

On Fri, Aug 19, 2022 at 08:59:20AM +0800, Yafang Shao wrote:
> On Fri, Aug 19, 2022 at 6:20 AM Tejun Heo <tj@...nel.org> wrote:
> > memcg folks would have better informed opinions but from generic cgroup pov
> > I don't think this is a good direction to take. This isn't a problem limited
> > to bpf progs and it doesn't make whole lot of sense to solve this for bpf.
> 
> This change is bpf specific. It doesn't refactor a whole lot of things.

I'm not sure what point the above sentence is making. It may not change a
lot of code but it does introduce significantly new mode of operation which
affects memcg and cgroup in general.

> > We have the exact same problem for any resources which span multiple
> > instances of a service including page cache, tmpfs instances and any other
> > thing which can persist longer than procss life time. My current opinion is
> > that this is best solved by introducing an extra cgroup layer to represent
> > the persistent entity and put the per-instance cgroup under it.
> 
> It is not practical on k8s.
> Because, before the persistent entity, the cgroup dir is stateless.
> After, it is stateful.
> Pls, don't continue keeping blind eyes on k8s.

Can you please elaborate why it isn't practical for k8s? I don't know the
details of k8s and what you wrote above is not a detailed enough technical
argument.

> > It does require reorganizing how things are organized from userspace POV but
> > the end result is really desirable. We get entities accurately representing
> > what needs to be tracked and control over the granularity of accounting and
> > control (e.g. folks who don't care about telling apart the current
> > instance's usage can simply not enable controllers at the persistent entity
> > level).
> 
> Pls.s also think about why k8s refuse to use cgroup2.

This attitude really bothers me. You aren't spelling it out fully but
instead of engaging in the technical argument at the hand, you're putting
forth conforming upstream to the current k8s's assumptions and behaviors as
a requirement and then insisting that it's upstream's fault that k8s is
staying with cgroup1.

This is not an acceptable form of argument and it would be irresponsible to
grant any kind weight to this line of reasoning. k8s may seem like the world
to you but it is one of many use cases of the upstream kernel. We all should
pay attention to the use cases and technical arguments to determine how we
chart our way forward, but being k8s or whoever else clearly isn't a waiver
to claim this kind of unilateral demand.

It's okay to emphasize the gravity of the specific use case at hand but
please realize that it's one of the many factors that should be considered
and sometimes one which can and should get trumped by others.

Thanks.

-- 
tejun

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ