linux-kernel - Re: [RFC PATCH 1/8] memcg: Enable fine-grained control of over memory.high action

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200817161132.GA5171@chrisdown.name>
Date:   Mon, 17 Aug 2020 17:11:32 +0100
From:   Chris Down <chris@...isdown.name>
To:     Waiman Long <longman@...hat.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Jonathan Corbet <corbet@....net>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, cgroups@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [RFC PATCH 1/8] memcg: Enable fine-grained control of over
 memory.high action

Waiman Long writes:
>On 8/17/20 10:30 AM, Chris Down wrote:
>>Astractly, I think this really overcomplicates the API a lot. If 
>>these are truly generally useful (and I think that remains to be 
>>demonstrated), they should be additions to the existing API, rather 
>>than a sidestep with prctl.
>This patchset is derived from customer requests. With existing API, I 
>suppose you mean the memory cgroup API. Right? The reason to use 
>prctl() is that there are users out there who want some kind of 
>per-process control instead of for a whole group of processes unless 
>the users try to create one cgroup per process which is not very 
>efficient.

If using one cgroup per process is inefficient, then that's what needs to be 
fixed. Making the API extremely complex to reason about for every user isn't a 
good compromise when we're talking about an already niche use case.

>>I also worry about some other more concrete things:
>>
>>1. Doesn't this allow unprivileged applications to potentially 
>>bypass    memory.high constraints set by a system administrator?
>The memory.high constraint is for triggering memory reclaim. The new 
>mitigation actions introduced by this patchset will only be applied if 
>memory reclaim alone fails to limit the physical memory consumption. 
>The current memory cgroup memory reclaim code will not be affected by 
>this patchset.

memory.high isn't only for triggering memory reclaim, it's also about active 
throttling when the application fails to come under. Fundamentally it's 
supposed to indicate the point at which we expect the application to either 
cooperate or get forcibly descheduled -- take a look at where we call 
schedule_timeout_killable.

I really struggle to think about how all of those things should interact in 
this patchset.

>>2. What's the purpose of PR_MEMACT_KILL, compared to memory.max?
>A user can use this to specify which processes are less important and 
>can be sacrificed first instead of the other more important ones in 
>case they are really in a OOM situation. IOW, users can specify the 
>order where OOM kills can happen.

You can already do that with something like oomd, which has way more 
flexibility than this. Why codify this in the kernel instead of in a userspace 
agent?