Message-ID: <CALvZod6y8EfQt02+rNOP_JXgzpJJHjuVzd++T3E=NEMwwBv_CQ@mail.gmail.com>
Date:   Fri, 12 Jan 2018 14:57:35 -0800
From:   Shakeel Butt <shakeelb@...gle.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Cgroups <cgroups@...r.kernel.org>, Linux MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4] mm/memcg: try harder to decrease [memory,memsw].limit_in_bytes

On Fri, Jan 12, 2018 at 4:24 AM, Michal Hocko <mhocko@...nel.org> wrote:
> On Fri 12-01-18 00:59:38, Andrey Ryabinin wrote:
>> On 01/11/2018 07:29 PM, Michal Hocko wrote:
> [...]
>> > I do not think so. Consider that this reclaim races with other
>> > reclaimers. Now you are reclaiming a large chunk so you might end up
>> > reclaiming more than necessary. SWAP_CLUSTER_MAX would reduce the over
>> > reclaim to be negligible.
>> >
>>
>> I did consider this, and I think I already explained that sort of race in my previous email.
>> Whether "Task B" is really a task in cgroup or it's actually a bunch of reclaimers,
>> doesn't matter. That doesn't change anything.
>
> I would _really_ prefer two patches here. The first one removing the
> hard coded reclaim count. That thing is just dubious at best. If you
> _really_ think that the higher reclaim target is meaningful then make
> it a separate patch. I am not convinced but I will not nack it either.
> But it will make our life much easier if my over reclaim concern is
> right and we will need to revert it. Conceptually those two changes are
> independent anyway.
>

Personally I feel that the cgroup-v2 semantics are much cleaner for
setting the limit. There is no race with the allocators in the memcg,
though the OOM killer can be triggered. For cgroup-v1, the user does not
expect the OOM killer, and EBUSY is expected on unsuccessful reclaim. How
about we do something similar here and make sure the OOM killer cannot be
triggered for the given memcg?

// pseudo code
disable_oom(memcg);
old = xchg(&memcg->memory.limit, requested_limit);

/* reclaim memory until usage gets below the new limit or retries are exhausted */

if (unsuccessful) {
	reset_limit(memcg, old);
	ret = -EBUSY;
} else {
	ret = 0;
}
enable_oom(memcg);

This way there is no race with the allocators and the OOM killer will not
be triggered. The processes in the memcg can suffer, but that should be
within the expectation of the user. One disclaimer though: disabling
OOM for a memcg needs more thought.
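
To make the control flow concrete, here is a rough user-space model of the
idea in plain C11. It is only a sketch, not kernel code: struct memcg_model,
try_reclaim(), set_limit(), the oom_disabled flag and MAX_RETRIES are all
made-up names for illustration, and "reclaim" just decrements a counter. It
only shows the ordering: disable OOM, swap in the new limit, reclaim with
bounded retries, and put the old limit back with -EBUSY on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_RETRIES 5	/* hypothetical retry budget */

/* Hypothetical stand-in for a memcg; all counts are in pages. */
struct memcg_model {
	atomic_long limit;
	atomic_long usage;
	bool oom_disabled;	/* models disable_oom()/enable_oom() */
	long reclaimable;	/* pages reclaim is still able to free */
};

/* Toy reclaim: frees up to nr pages and returns how many were freed. */
static long try_reclaim(struct memcg_model *m, long nr)
{
	long freed = nr < m->reclaimable ? nr : m->reclaimable;

	m->reclaimable -= freed;
	atomic_fetch_sub(&m->usage, freed);
	return freed;
}

/* Returns 0 on success, -EBUSY if usage could not be pushed below new_limit. */
static int set_limit(struct memcg_model *m, long new_limit)
{
	int retries = MAX_RETRIES;
	int ret = 0;
	long old;

	assert(new_limit >= 0);

	m->oom_disabled = true;
	/* Publish the new limit first so allocators see it immediately. */
	old = atomic_exchange(&m->limit, new_limit);

	while (atomic_load(&m->usage) > new_limit && retries > 0) {
		long over = atomic_load(&m->usage) - new_limit;

		/* Only burn a retry when reclaim made no progress. */
		if (try_reclaim(m, over) == 0)
			retries--;
	}

	if (atomic_load(&m->usage) > new_limit) {
		/* Reclaim failed: restore the previous limit. */
		atomic_store(&m->limit, old);
		ret = -EBUSY;
	}

	m->oom_disabled = false;
	return ret;
}
```

In this model a request that reclaim can satisfy leaves the new limit in
place, while one it cannot satisfy returns -EBUSY with the old limit restored;
either way the OOM flag is cleared again on exit.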

Shakeel