Message-ID: <15c605f27f87d732e80e294f13fd9513697b65e3.camel@redhat.com>
Date:   Fri, 27 Jan 2023 04:35:22 -0300
From:   Leonardo Brás <leobras@...hat.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Roman Gushchin <roman.gushchin@...ux.dev>,
        Marcelo Tosatti <mtosatti@...hat.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <muchun.song@...ux.dev>,
        Andrew Morton <akpm@...ux-foundation.org>,
        cgroups@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/5] Introduce memcg_stock_pcp remote draining

On Fri, 2023-01-27 at 08:20 +0100, Michal Hocko wrote:
> On Fri 27-01-23 04:14:19, Leonardo Brás wrote:
> > On Thu, 2023-01-26 at 15:12 -0800, Roman Gushchin wrote:
> [...]
> > > I'd rather opt out of stock draining for isolated cpus: it might slightly reduce
> > > the accuracy of memory limits and slightly increase the memory footprint (all
> > > those dying memcgs...), but the impact will be limited. Actually it is limited
> > > by the number of cpus.
> > 
> > I was discussing this same idea with Marcelo yesterday morning.
> > 
> > The questions we had on the topic were:
> > a - Roughly how many pages will the pcp cache hold before draining them itself?
> 
> MEMCG_CHARGE_BATCH (64 currently). And one more clarification. The cache
> doesn't really hold any pages. It is a mere counter of how many charges
> have been accounted for the memcg page counter. So it is not really
> consuming a proportional amount of resources. It just pins the
> corresponding memcg. Have a look at consume_stock and refill_stock

I see. Thanks for pointing that out!
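
Just to check that I got it, here is a heavily simplified sketch of how I read
the per-cpu stock (written from memory, not the actual mm/memcontrol.c code;
locking, irq handling and the draining work are left out):

struct memcg_stock_pcp {
	struct mem_cgroup *cached;	/* memcg the stock is pre-charged for */
	unsigned int nr_pages;		/* charges accounted but not yet used */
};

/* Fast path: reuse a pre-accounted charge if the local stock matches. */
static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
{
	struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);

	if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
		stock->nr_pages -= nr_pages;
		return true;
	}
	return false;
}

/* Slow path: after charging the page_counter by MEMCG_CHARGE_BATCH,
 * park the surplus locally, which pins @memcg until the stock drains. */
static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
{
	struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);

	if (stock->cached != memcg) {
		drain_stock(stock);	/* give back any old surplus */
		stock->cached = memcg;
	}
	stock->nr_pages += nr_pages;
}

So the 64 "pages" are only a counter against the memcg's page_counter, not
pages sitting in a cache.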

So in the worst-case scenario the memcg could have up to 64 pages * (numcpus - 1)
reserved but not getting used, which may cause an 'earlier' OOM if that amount
is needed but can't be freed.

Staying with the worst case: on a big powerpc machine with 256 CPUs and 64KB
pages, each CPU holding a stock of 64 pages => 1GB of memory pinned, minus the
4MB of the one CPU actually using its stock.
It's starting to get too big, but still ok for a machine this size.
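
Spelling the arithmetic out (assuming 64KB pages and MEMCG_CHARGE_BATCH = 64):

	64 pages * 64KB          =    4MB pinned per CPU
	4MB * 256 CPUs           =    1GB pinned in total
	1GB - 4MB (the busy CPU) = ~1020MB sitting idle in the worst case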

The thing is that it can produce odd behavior:
You have a cgroup created some time ago, now empty, you try to run a given
application in it, and it hits OOM.
You then restart the cgroup and run the same application without any issue.

Even though it looks like a good option, this can be perceived by the user as
instability.

> 
> > b - Would it also cache bigger pages, or huge pages, in the same way?
> 
> The above should answer this as well as the follow-ups, I hope. If
> not, let me know.

IIUC we are talking about normal pages here, is that it?

Best regards,
Leo
