Message-ID: <CALvZod79PDmPLYAXm=EQDrn8mQfE9aQL+Mgaai6zu=uqucbMAQ@mail.gmail.com>
Date:   Wed, 7 Apr 2021 08:42:31 -0700
From:   Shakeel Butt <shakeelb@...gle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Bharata B Rao <bharata@...ux.ibm.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>
Subject: Re: High kmalloc-32 slab cache consumption with 10k containers

On Wed, Apr 7, 2021 at 4:55 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Mon 05-04-21 11:18:48, Bharata B Rao wrote:
> > Hi,
> >
> > When running 10000 (more-or-less-empty) containers on a bare-metal Power9
> > server (160 CPUs, 2 NUMA nodes, 256G memory), it is seen that memory
> > consumption increases quite a lot (around 172G) when the containers are
> > running. Most of it comes from slab (149G), and within slab the majority
> > comes from the kmalloc-32 cache (102G).
>
> Is this 10k-cgroup setup a testing environment or does anybody really use
> that in production? I would be really curious to hear how it behaves when
> those containers are not idle. E.g. global memory reclaim iterating over
> 10k memcgs will likely be very visible. I do remember playing with
> similar setups a few years back and the overhead was very high.
> --
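
For reference, the per-cache numbers quoted above can be estimated from
/proc/slabinfo. A rough Python sketch (assuming the slabinfo version 2.1
column layout and root access; the per-cache estimate ignores per-CPU
partial slabs, so it is approximate):

#!/usr/bin/env python3
# Rank slab caches by estimated memory footprint using /proc/slabinfo.
# Per-cache estimate: num_slabs * pages_per_slab * PAGE_SIZE (approximate).
import resource

PAGE_SIZE = resource.getpagesize()

def slab_usage(path="/proc/slabinfo"):
    usage = {}
    with open(path) as f:
        for line in f:
            # Skip the "slabinfo - version: 2.1" line and the "# name ..." header.
            if line.startswith(("slabinfo", "#")):
                continue
            fields = line.split()
            name = fields[0]
            pages_per_slab = int(fields[5])
            num_slabs = int(fields[14])
            usage[name] = num_slabs * pages_per_slab * PAGE_SIZE
    return usage

if __name__ == "__main__":
    top = sorted(slab_usage().items(), key=lambda kv: kv[1], reverse=True)[:10]
    for name, nbytes in top:
        print(f"{name:24s} {nbytes / (1 << 20):10.1f} MiB")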

I can tell you about our environment. A couple of thousand memcgs (~2k)
is very normal on our machines, as a machine can be running 100+ jobs
(and each job can manage its own sub-memcgs). However, each job can
have a high number of mounts: there is no local disk, and each package
of the job is remotely mounted (it is a bit more complicated than that).

We do have issues with global memory reclaim, but proactive reclaim
mostly makes global reclaim a tail issue (and at the tail it often
does create havoc).
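
FWIW, to illustrate what "proactive reclaim" means mechanically, here is a
minimal sketch of driving reclaim from userspace for a single cgroup,
assuming a cgroup v2 hierarchy on a kernel that exposes memory.reclaim and
a hypothetical cgroup path; it is only an illustration, real policies are
driven by working-set estimates and are more careful than this:

#!/usr/bin/env python3
# Minimal userspace "proactive reclaim" loop for one cgroup (cgroup v2).
# Periodically asks the kernel to reclaim a slice of the cgroup's current
# usage by writing a byte count to memory.reclaim. Purely illustrative.
import os
import time

CGROUP = "/sys/fs/cgroup/job0"   # hypothetical cgroup path
RECLAIM_FRACTION = 0.05          # try to trim 5% of current usage each pass
INTERVAL = 60                    # seconds between passes

def reclaim_once(cg):
    with open(os.path.join(cg, "memory.current")) as f:
        current = int(f.read())
    target = int(current * RECLAIM_FRACTION)
    if target == 0:
        return
    try:
        with open(os.path.join(cg, "memory.reclaim"), "w") as f:
            f.write(str(target))
    except OSError:
        # The write fails with EAGAIN if the kernel could not reclaim the
        # full amount; a real daemon would back off here.
        pass

if __name__ == "__main__":
    while True:
        reclaim_once(CGROUP)
        time.sleep(INTERVAL)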
