Message-ID: <CALvZod66KF-8xKB1dyY2twizDE=svE8iXT_nqvsrfWg1a92f4A@mail.gmail.com>
Date: Fri, 16 Jul 2021 05:55:42 -0700
From: Shakeel Butt <shakeelb@...gle.com>
To: Vasily Averin <vvs@...tuozzo.com>
Cc: Tejun Heo <tj@...nel.org>, Cgroups <cgroups@...r.kernel.org>,
Michal Hocko <mhocko@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Roman Gushchin <guro@...com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Alexey Dobriyan <adobriyan@...il.com>,
Andrei Vagin <avagin@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Borislav Petkov <bp@...en8.de>,
Christian Brauner <christian.brauner@...ntu.com>,
David Ahern <dsahern@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Dmitry Safonov <0x7f454c46@...il.com>,
Eric Dumazet <edumazet@...gle.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Jakub Kicinski <kuba@...nel.org>,
"J. Bruce Fields" <bfields@...ldses.org>,
Jeff Layton <jlayton@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Jiri Slaby <jirislaby@...nel.org>,
Kirill Tkhai <ktkhai@...tuozzo.com>,
Oleg Nesterov <oleg@...hat.com>,
Serge Hallyn <serge@...lyn.com>,
Thomas Gleixner <tglx@...utronix.de>,
Zefan Li <lizefan.x@...edance.com>,
netdev <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 00/16] memcg accounting from OpenVZ

On Thu, Jul 15, 2021 at 9:11 PM Vasily Averin <vvs@...tuozzo.com> wrote:
>
> On 7/15/21 8:11 PM, Shakeel Butt wrote:
> > On Tue, Apr 27, 2021 at 11:51 PM Vasily Averin <vvs@...tuozzo.com> wrote:
> >>
> >> OpenVZ has used memory accounting for 20+ years, since the v2.2.x Linux kernels.
> >> Initially we used our own accounting subsystem, then partially committed
> >> it to upstream, and a few years ago switched to cgroups v1.
> >> Now we're rebasing again, revising our old patches and trying to push
> >> them upstream.
> >>
> >> We try to protect the host system from any misuse of kernel memory
> >> allocation triggered by untrusted users inside the containers.
> >>
> >> The patch set is addressed mostly to the cgroups maintainers and the cgroups@
> >> mailing list, though I would be very grateful for any comments from maintainers
> >> of the affected subsystems or other people added in cc.
> >>
> >> Compared to the upstream, we additionally account the following kernel objects:
> >> - network devices and their Tx/Rx queues
> >> - ipv4/v6 addresses and routing-related objects
> >> - inet_bind_bucket cache objects
> >> - VLAN group arrays
> >> - ipv6/sit: ip_tunnel_prl
> >> - scm_fp_list objects used by SCM_RIGHTS messages of Unix sockets
> >> - nsproxy and the namespace objects themselves
> >> - IPC objects: semaphores, message queues and shared memory segments
> >> - mounts
> >> - pollfd and select bits arrays
> >> - signals and posix timers
> >> - file locks
> >> - fasync_struct used by the file lease code and driver's fasync queues
> >> - tty objects
> >> - per-mm LDT
> >>
> >> We have incorrect/incomplete/obsolete accounting for a few other kernel
> >> objects: sk_filter, af_packets, netlink and xt_counters for iptables.
> >> They require rework and will probably be dropped altogether.
> >>
> >> We are also going to add accounting for nft, but it is not ready yet.
> >>
> >> We have not tested performance on upstream; however, our performance team
> >> compared our current RHEL7-based production kernel with the corresponding
> >> original RHEL7 kernel and reports that it is at least no worse.
> >
> > Hi Vasily,
> >
> > What's the status of this series? I see a couple patches did get
> > acked/reviewed. Can you please re-send the series with updated ack
> > tags?
>
> Technically my patches do not have any NAKs. Practically they are still not merged.
> I expected Michal to push them, but he advised me to go through the subsystem maintainers.
> I've asked Tejun to pick up the whole patch set and I'm waiting for his feedback right now.
>
> I can resend the patch set once again, with the collected approvals and rebased onto v5.14-rc1.
> However, I do not understand how that helps to push the patches if they should be processed
> through subsystem maintainers. As far as I understand, I'll need to split this patch set into
> per-subsystem pieces and send them to the corresponding maintainers.
>
Usually these kinds of patches (adding memcg accounting) go through the mm
tree, but if there are no dependencies between the patches and there is a
consensus that each subsystem maintainer picks up the corresponding patch,
then that is fine too.