linux-kernel - Re: [PATCH v2 3/3] mm: memcontrol: recursive memory.low protection

Message-ID: <20200214135728.GK88887@mtj.thefacebook.com>
Date:   Fri, 14 Feb 2020 08:57:28 -0500
From:   Tejun Heo <tj@...nel.org>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Roman Gushchin <guro@...com>, linux-mm@...ck.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
        kernel-team@...com
Subject: Re: [PATCH v2 3/3] mm: memcontrol: recursive memory.low protection

Hello,

On Fri, Feb 14, 2020 at 08:15:37AM +0100, Michal Hocko wrote:
> > Yes, it can set up the control knobs as directed but it doesn't ship
> > with any material resource configurations or has conventions set up
> > around it.
> 
> Right. But services might use those knobs, right? And that means that if
> somebody wants a memory protection then the service file is going to use
> MemoryLow=$FOO and that is likely not going to work properly without
> additional hassle, e.g. propagating it upwards, which systemd doesn't do
> unless I am mistaken.

While there are applications where strict protection makes sense, in a
lot of cases, resource decisions have to consider factors global to
the system - how much is there and for what purpose the system is
being set up. Static per-service configuration for sure doesn't work
and neither will dynamic configuration without considering system-wide
factors.

Another aspect is that as configuration gets more granular and
stricter with memory knobs, the configuration becomes less
work-conserving. The kernel's MM keeps track of dynamic behavior and
adapts to the dynamic usage; these configurations can't.

So, while individual applications may indicate what their resource
dispositions are, a working configuration is not gonna come from each
service declaring how many bytes it wants.

This doesn't mean configurations are more tedious or difficult. In
fact, in a lot of cases, categorizing applications on the system
broadly and assigning ballpark weights and memory protections from the
higher level is sufficient.
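
As a rough sketch of what such a coarse, top-down setup could look like,
the snippet below writes ballpark values to the cgroup2 interface files
memory.low and cpu.weight for a handful of top-level slices. The slice
names, the numbers and the apply_policy() helper are all made up for
illustration. It also assumes that the slices already exist and that
the memory and cpu controllers are enabled for them; with systemd the
same thing would more likely be expressed as MemoryLow= and CPUWeight=
on the slice units.

```python
from pathlib import Path

GIB = 1024 ** 3

# Hypothetical categories and ballpark numbers, not a recommendation.
POLICY = {
    "workload.slice": {"memory.low": 16 * GIB, "cpu.weight": 500},
    "system.slice": {"memory.low": 1 * GIB, "cpu.weight": 100},
    "background.slice": {"memory.low": 0, "cpu.weight": 25},
}


def apply_policy(policy, cgroup_root="/sys/fs/cgroup"):
    """Write each knob of each slice under cgroup_root.

    Assumes the slice directories already exist; returns the list of
    paths that were written.
    """
    root = Path(cgroup_root)
    written = []
    for slice_name, knobs in policy.items():
        for knob, value in knobs.items():
            path = root / slice_name / knob
            path.write_text(f"{value}\n")
            written.append(str(path))
    return written
```

Nothing here is per-service; everything below a slice simply shares
whatever that slice was given.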

> > > Besides that we are talking about memcg features which are available only
> > > in the unified hierarchy and that is what systemd is using already.
> > 
> > I'm not quite sure what the above sentence is trying to say.
> 
> I meant to say that once the unified hierarchy is used by systemd you
> cannot configure it differently to suit your needs without interfering
> with systemd.

I haven't experienced systemd getting in the way of structuring the
cgroup hierarchy and configuring it. It's pretty flexible and easy to
configure. Do you have any specific constraints in mind?

> > There's a plan to integrate streamlined implementation of oomd into
> > systemd. There was a thread somewhere but the only thing I can find
> > now is a phoronix link.
> > 
> >   https://www.phoronix.com/scan.php?page=news_item&px=Systemd-Facebook-OOMD
> 
> I am not sure I see how that is going to change much wrt. resource
> distribution TBH. Is the existing cgroup hierarchy going to change for
> the OOMD to be deployed?

It's not a hard requirement but it'll be a lot more useful with an
actual resource hierarchy. As more resource control features get
enabled, I think it'll converge that way because that's more useful.

> > Yeah, exactly, all it needs to do is placing scopes / services
> > according to resource hierarchy and configure overall policy at higher
> > level slices, which is exactly what the memory.low semantics change
> > will allow.
> 
> Let me ask more specifically. Is there any plan or existing API to allow
> to configure which services are related resource wise?

At kernel level, no. They seem like pretty high level policy decisions
to me.

> > > That being said, I do not really blame systemd here. We are not making
> > > their life particularly easy TBH.
> > 
> > Do you mind elaborating a bit?
> 
> I believe I have already expressed the configurability concern elsewhere
> in the email thread. It boils down to necessity to propagate
> protection all the way up the hierarchy properly if you really need to
> protect leaf cgroups that are organized without a resource control in
> mind. Which is what systemd does.
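
To make the upward propagation described above concrete, here is a
small sketch of the bookkeeping it implies when protection is not
recursive: every ancestor needs a memory.low of at least the sum of
what its children need, otherwise the leaf setting can't take full
effect. The tree, the service names and the propagate_low() helper are
hypothetical; the result is only a table of values somebody would still
have to write out, starting at the top-level slice since the root
cgroup has no memory.low.

```python
def propagate_low(children, leaf_low, node, out=None):
    """Compute the memory.low (in bytes) needed at node and below.

    children maps a cgroup name to the list of its child cgroups;
    leaf_low maps leaf cgroups to the protection they ask for.
    A leaf gets its own request (0 if it has none); an inner node gets
    the sum of its children's values.
    """
    if out is None:
        out = {}
    kids = children.get(node, [])
    if not kids:
        out[node] = leaf_low.get(node, 0)
    else:
        total = 0
        for kid in kids:
            propagate_low(children, leaf_low, kid, out)
            total += out[kid]
        out[node] = total
    return out


GIB = 1024 ** 3

# Hypothetical hierarchy, organized without resource control in mind.
children = {
    "system.slice": ["db.service", "web.service", "misc.slice"],
    "misc.slice": ["cron.service"],
}
leaf_low = {"db.service": 4 * GIB, "web.service": 2 * GIB}

needed = propagate_low(children, leaf_low, "system.slice")
```

In this example system.slice ends up needing 6G even though nothing
about system.slice as a whole was meant to be protected.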

But that doesn't work for other controllers at all. I'm having a
difficult time imagining how making this one control mechanism work
that way makes sense. Memory protection has to be configured together
with IO protection to be actually effective.

As for the cgroup hierarchy being unrelated to how controllers behave,
it frankly reminds me of the cgroup1 memcg flat hierarchy thing, and
I'm not sure how that would actually work in terms of resource
isolation. Also, I'm not sure how systemd forces such configurations
and I'd think systemd folks would be happy to fix them if there are
such problems. Is the point you're trying to make "because of systemd,
we have to contort how the memory controller behaves"?

Thanks.

-- 
tejun