Date:   Tue, 8 Aug 2017 16:24:32 -0700 (PDT)
From:   David Rientjes <rientjes@...gle.com>
To:     Roman Gushchin <guro@...com>
cc:     linux-mm@...ck.org, Michal Hocko <mhocko@...nel.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Tejun Heo <tj@...nel.org>, kernel-team@...com,
        cgroups@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [v4 4/4] mm, oom, docs: describe the cgroup-aware OOM killer

On Wed, 26 Jul 2017, Roman Gushchin wrote:

> +Cgroup-aware OOM Killer
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Cgroup v2 memory controller implements a cgroup-aware OOM killer.
> +It means that it treats memory cgroups as first class OOM entities.
> +
> +Under OOM conditions the memory controller tries to make the best
> +choice of a victim, hierarchically looking for the largest memory
> +consumer. By default, it will look for the biggest task in the
> +biggest leaf cgroup.
> +
> +By default, all cgroups have oom_priority 0, and the OOM killer will
> +choose the largest cgroup recursively on each level. For non-root
> +cgroups it's possible to change the oom_priority, and it will cause
> +the OOM killer to look at the priority value first, and compare
> +sizes only of cgroups with equal priority.
> +
> +But a user can change this behavior by enabling the per-cgroup
> +oom_kill_all_tasks option. If set, it causes the OOM killer to treat
> +the whole cgroup as an indivisible memory consumer. If it's
> +selected as an OOM victim, all tasks belonging to it will be killed.
> +
> +Tasks in the root cgroup are treated as independent memory consumers,
> +and are compared with other memory consumers (e.g. leaf cgroups).
> +The root cgroup doesn't support the oom_kill_all_tasks feature.
> +
> +This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
> +the memory controller considers only cgroups belonging to the sub-tree
> +of the OOM'ing cgroup.
> +
>  IO
>  --

Thanks very much for following through with this.

As described in http://marc.info/?l=linux-kernel&m=149980660611610 this is 
very similar to what we do for priority based oom killing.

I'm wondering what your thoughts are on extending it one step further, 
however: include process priority as part of the selection rather than 
simply memcg priority.

memory.oom_priority will dictate which memcg the kill will originate from, 
but processes have no way to specify that they themselves should be killed 
in preference to a leaf memcg.  I'm not sure how important this is for 
your use case, but we have found it useful to be able to specify process 
priority as part of the decision-making.

At each level of consideration, we simply kill a process with lower 
/proc/pid/oom_priority if there are no memcgs with a lower 
memory.oom_priority.  This allows us to define the exact process that will 
be oom killed, absent oom_kill_all_tasks, and not require that the process 
be attached to a leaf memcg.  Most notably these are best-effort 
processes: stats collection, logging, etc.
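
As a rough illustration of that rule, here is a toy Python model, not 
kernel code.  The Process and Memcg classes and select_victim() are 
hypothetical names made up for this sketch.  It assumes that a lower value 
means "kill first", that a process wins a priority tie against a memcg, 
and that ties within each kind are broken by memory usage; the last two 
points are guesses on top of the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Process:
    pid: int
    oom_priority: int  # hypothetical per-process value (/proc/pid/oom_priority)
    rss: int           # memory usage, arbitrary units


@dataclass
class Memcg:
    name: str
    oom_priority: int = 0  # memory.oom_priority
    procs: List[Process] = field(default_factory=list)
    children: List["Memcg"] = field(default_factory=list)

    def usage(self) -> int:
        # Total usage of this memcg including its whole subtree.
        return sum(p.rss for p in self.procs) + sum(c.usage() for c in self.children)


def select_victim(cg: Memcg) -> Optional[Process]:
    """Pick one process to kill, walking down from cg (assumed semantics)."""
    # Lowest-priority process attached at this level; bigger rss breaks ties.
    proc = min(cg.procs, key=lambda p: (p.oom_priority, -p.rss), default=None)
    # Lowest-priority child memcg that still has memory charged to it.
    populated = [c for c in cg.children if c.usage() > 0]
    child = min(populated, key=lambda c: (c.oom_priority, -c.usage()), default=None)

    if child is None:
        return proc
    # Kill the process unless some memcg has a strictly lower priority.
    if proc is not None and proc.oom_priority <= child.oom_priority:
        return proc
    return select_victim(child)


batch = Memcg("batch", oom_priority=0,
              procs=[Process(10, 3, 50), Process(11, 1, 10)])
serving = Memcg("serving", oom_priority=5, procs=[Process(20, 0, 500)])
root = Memcg("root", procs=[Process(1, 5, 100)], children=[batch, serving])

victim = select_victim(root)
```

With this layout the walk descends into "batch" (priority 0 is lower than 
the root-level process at 5) and picks pid 11, the lowest-priority process 
there.  Attaching a best-effort logger with priority -1 directly to the 
root would make that logger the victim instead, without it having to live 
in a leaf memcg.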

Do you think it would be helpful to introduce per-process oom priority as 
well?
