Message-ID: <89067792-2c39-bcf2-6a35-80cab101c5ac@linux.alibaba.com>
Date: Sun, 16 Jun 2019 19:57:01 +0800
From: Xunlei Pang <xlpang@...ux.alibaba.com>
To: Chris Down <chris@...isdown.name>
Cc: Roman Gushchin <guro@...com>, Michal Hocko <mhocko@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] memcg: Ignore unprotected parent in mem_cgroup_protected()
Hi Chris,
On 2019/6/16 PM 6:37, Chris Down wrote:
> Hi Xunlei,
>
> Xunlei Pang writes:
>> Docker and various types of containers (with different memory
>> capacities) are managed by k8s. Maintaining those dynamic figures
>> is a burden for k8s, so simply setting "max" on key containers is
>> always welcome.
>
> Right, setting "max" is generally a fine way of going about it.
>
>> Setting "max" on docker also unnecessarily protects the docker
>> cgroup's own memory (as docker itself has tasks).
>
> That's not correct -- leaf memcgs have to _explicitly_ request memory
> protection. From the documentation:
>
> memory.low
>
> [...]
>
> Best-effort memory protection. If the memory usages of a
> cgroup and all its ancestors are below their low boundaries,
> the cgroup's memory won't be reclaimed unless memory can be
> reclaimed from unprotected cgroups.
>
> Note the part that the cgroup itself also must be within its low
> boundary, which is not implied simply by having ancestors that would
> permit propagation of protections.
>
> In this case, Docker just shouldn't request it for those Docker-related
> tasks, and they won't get any. That seems a lot simpler and more
> intuitive than special casing "0" in ancestors.
>
>> This patch has no effect on any intermediate layer with a
>> positive memory.min set; it requires all the ancestors to have
>> memory.min set to 0 to work.
>>
>> It changes nothing special, but is more flexible for business deployments...
>
> Not so, this change is extremely "special". It violates the basic
> expectation that 0 means no possibility of propagation of protection,
> and I still don't see a compelling argument why Docker can't just set
> "max" in the intermediate cgroup and not accept any protection in leaf
> memcgs that it doesn't want protection for.
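I see the reason now: I'm using cgroup v1 (with a memory.min backport),
which permits tasks to exist in the "docker" cgroup's own cgroup.procs.
For cgroup v2 it's not a problem, since tasks cannot live in an
intermediate cgroup that distributes memory to its children.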
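For illustration, here is a rough sketch of the rule as the quoted
documentation words it: a cgroup counts as protected only when it and all
of its ancestors are within their low boundaries. This is a simplified
model, not the kernel's mem_cgroup_protected() logic. The Cgroup class,
the within_low() and is_low_protected() helpers, the usage numbers, and
the "at or below" comparison are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cgroup:
    """Hypothetical stand-in for a memory cgroup; not a kernel structure."""
    name: str
    usage: int                      # current memory usage, arbitrary units
    low: float = 0                  # memory.low; math.inf stands in for "max"
    parent: Optional["Cgroup"] = None


def within_low(cg: Cgroup) -> bool:
    # A boundary of 0 protects nothing; otherwise usage must not exceed it.
    return cg.low > 0 and cg.usage <= cg.low


def is_low_protected(cg: Cgroup) -> bool:
    # Protected only if the cgroup itself and every ancestor pass the check.
    node: Optional[Cgroup] = cg
    while node is not None:
        if not within_low(node):
            return False
        node = node.parent
    return True


# Assumed layout: "docker" is the intermediate cgroup set to "max".
docker = Cgroup("docker", usage=300, low=math.inf)
key = Cgroup("key-container", usage=200, low=math.inf, parent=docker)
other = Cgroup("other-container", usage=100, low=0, parent=docker)

print(is_low_protected(key))     # leaf that explicitly asked for protection
print(is_low_protected(other))   # leaf that left memory.low at 0
print(is_low_protected(docker))  # the intermediate cgroup itself
```

In this model the three calls print True, False and True. The last
result is the v1 situation above: tasks attached directly to "docker"
are covered by its own "max" setting, while a leaf that keeps 0 gets
nothing.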
Thanks,
Xunlei