Message-ID: <YntkEUKPquTbBjMu@dhcp22.suse.cz>
Date:   Wed, 11 May 2022 09:21:53 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     CGEL <cgel.zte@...il.com>
Cc:     akpm@...ux-foundation.org, hannes@...xchg.org, willy@...radead.org,
        shy828301@...il.com, roman.gushchin@...ux.dev, shakeelb@...gle.com,
        linmiaohe@...wei.com, william.kucharski@...cle.com,
        peterx@...hat.com, hughd@...gle.com, vbabka@...e.cz,
        songmuchun@...edance.com, surenb@...gle.com,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        cgroups@...r.kernel.org, Yang Yang <yang.yang29@....com.cn>
Subject: Re: [PATCH] mm/memcg: support control THP behaviour in cgroup

On Wed 11-05-22 01:59:52, CGEL wrote:
> On Tue, May 10, 2022 at 03:36:34PM +0200, Michal Hocko wrote:
[...]
> > Can you come up with a sane hierarchical behavior?
> >
> 
> I think this new interface had better be independent, not hierarchical, anyway,
> especially when we treat a container as a lightweight virtual machine.

I suspect you are focusing too much on your usecase and do not realize
the wider consequences of this being a user interface that still has to be
sensible for other usecases. Take delegation of the control to
subgroups as an example. If this is a per-memcg knob (like swappiness)
then children can override the parent's THP policy. This might be less of
a deal for swappiness because the anon/file reclaim balancing should
be mostly an internal thing. But the THP policy is different because it has
other effects on workloads running outside of the said cgroup - higher
memory demand, higher contention for high-order memory etc.

I do not really see how this could be a sensible per-memcg policy
without being fully hierarchical.
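
To make the delegation concern concrete, here is a minimal sketch, assuming a
hypothetical non-hierarchical per-memcg knob called memory.thp_enabled (the
file name is my assumption, not necessarily what the patch adds): once the
subtree is delegated, the child can simply override whatever the parent set,
even though the extra THP allocations affect memory pressure outside that
subtree.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a value into a (hypothetical) per-memcg knob. */
static int write_knob(const char *path, const char *val)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0) {
		perror(path);
		return -1;
	}
	if (write(fd, val, strlen(val)) < 0) {
		perror("write");
		close(fd);
		return -1;
	}
	close(fd);
	return 0;
}

int main(void)
{
	/* The admin restricts THP for the delegated subtree... */
	write_knob("/sys/fs/cgroup/parent/memory.thp_enabled", "never");
	/* ...but with a swappiness-like (non-hierarchical) knob the delegated
	 * child can re-enable it and nothing enforces the parent's policy. */
	write_knob("/sys/fs/cgroup/parent/child/memory.thp_enabled", "always");
	return 0;
}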

> 
> > [...]
> > > > > For a micro-service architecture, the application in one container is not a
> > > > > set of loosely coupled processes; it aims to provide one particular service,
> > > > > so different containers mean different services, and different services
> > > > > have different QoS demands.
> > > > 
> > > > OK, if they are tightly coupled you could apply the same THP policy via
> > > > the existing prctl interface. Why is that not feasible? As you are noting
> > > > below...
> > > > 
> > > > >     5. containers are usually managed by compose software, which treats the
> > > > > container as the base management unit;
> > > > 
> > > > ..so the compose software can easily start up the workload by using prctl
> > > > to disable THP for whatever workloads it is not suitable for.
> > > 
> > > prctl(PR_SET_THP_DISABLE..) cannot elegantly support the semantics we
> > > need. If only some containers need THP, while other containers and the host do
> > > not, we must first set the host THP policy to always, and then call prctl() to
> > > disable THP for host tasks and other containers one by one,
> > 
> > It might not be the most elegant solution but it should work.
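
(For reference, the "set the host THP policy to always first" step quoted above
refers to the global sysfs knob; a minimal sketch of just that step, run as
root:)

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	/* Global THP policy; accepted values are "always", "madvise", "never". */
	const char *path = "/sys/kernel/mm/transparent_hugepage/enabled";
	int fd = open(path, O_WRONLY);

	if (fd < 0) {
		perror(path);
		return 1;
	}
	if (write(fd, "always", strlen("always")) < 0) {
		perror("write");
		close(fd);
		return 1;
	}
	close(fd);
	return 0;
}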
> 
> So you agree it's reasonable to set the THP policy for processes in a container, right?

Yes, like in any other processes.

> If so, IMHO, when there are thousands of processes launching and dying on the machine,
> it would be horrible to do that by calling prctl() for each one; I don't see how that is reasonable.

Could you be more specific? The usual prctl use would normally be
handled by the launcher, relying on the per-process policy being
inherited down the road.
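
A minimal sketch of that launcher pattern (the workload path below is just a
placeholder): the launcher calls prctl(PR_SET_THP_DISABLE) once before exec,
and since the setting is inherited across fork() and preserved over execve(),
every process the workload subsequently spawns keeps THP disabled without any
further prctl() calls.

#include <stdio.h>
#include <sys/prctl.h>
#include <unistd.h>

int main(void)
{
	/* Disable THP for this process; children inherit the setting
	 * across fork() and it is preserved over execve(). */
	if (prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0)) {
		perror("prctl(PR_SET_THP_DISABLE)");
		return 1;
	}

	/* Hand over to the actual workload (placeholder path). */
	execl("/usr/bin/my-workload", "my-workload", (char *)NULL);
	perror("execl");
	return 1;
}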

-- 
Michal Hocko
SUSE Labs
