Date: Thu, 12 May 2022 15:31:34 -0700
From: Suren Baghdasaryan <surenb@...gle.com>
To: Gang Li <ligang.bdlg@...edance.com>, Michal Hocko <mhocko@...e.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Muchun Song <songmuchun@...edance.com>, hca@...ux.ibm.com, gor@...ux.ibm.com, agordeev@...ux.ibm.com, borntraeger@...ux.ibm.com, svens@...ux.ibm.com, "Eric W. Biederman" <ebiederm@...ssion.com>, Kees Cook <keescook@...omium.org>, Al Viro <viro@...iv.linux.org.uk>, Steven Rostedt <rostedt@...dmis.org>, Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, acme@...nel.org, mark.rutland@....com, alexander.shishkin@...ux.intel.com, jolsa@...nel.org, namhyung@...nel.org, David Hildenbrand <david@...hat.com>, imbrenda@...ux.ibm.com, apopple@...dia.com, Alexey Dobriyan <adobriyan@...il.com>, stephen.s.brennan@...cle.com, ohoono.kwon@...sung.com, haolee.swjtu@...il.com, Kalesh Singh <kaleshsingh@...gle.com>, zhengqi.arch@...edance.com, Peter Xu <peterx@...hat.com>, Yang Shi <shy828301@...il.com>, Colin Cross <ccross@...gle.com>, vincent.whitchurch@...s.com, Thomas Gleixner <tglx@...utronix.de>, bigeasy@...utronix.de, fenghua.yu@...el.com, linux-s390@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>, linux-mm <linux-mm@...ck.org>, linux-fsdevel <linux-fsdevel@...r.kernel.org>, linux-perf-users@...r.kernel.org
Subject: Re: [PATCH 0/5 v1] mm, oom: Introduce per numa node oom for CONSTRAINT_MEMORY_POLICY

On Wed, May 11, 2022 at 9:47 PM Gang Li <ligang.bdlg@...edance.com> wrote:
>
> TLDR:
> If a mempolicy is in effect (oc->constraint == CONSTRAINT_MEMORY_POLICY), out_of_memory() will
> select a victim on the specific node to kill, so that the kernel can avoid accidental killing on NUMA systems.
>
> Problem:
> Before this patch series, oom will only kill the process with the highest memory usage,
> by selecting the process with the highest oom_badness on the entire system.
>
> This works fine on UMA systems, but may cause some accidental killing on NUMA systems.
>
> As shown below, if process c.out is bound to Node1 and keeps allocating pages from Node1,
> a.out will be killed first. But killing a.out doesn't free any memory on Node1, so c.out
> will be killed next.
>
> A lot of our AMD machines have 8 numa nodes. In these systems, there is a greater chance
> of triggering this problem.
>
> OOM before patches:
> ```
> Per-node process memory usage (in MBs)
> PID             Node 0     Node 1        Total
> -----------     ---------- ------------- ----------
> 3095 a.out      3073.34    0.11          3073.45 (Killed first. Maximum memory consumption)
> 3199 b.out      501.35     1500.00       2001.35
> 3805 c.out      1.52       (grow)2248.00 2249.52 (Killed then. Node1 is full)
> -----------     ---------- ------------- ----------
> Total           3576.21    3748.11       7324.31
> ```
>
> Solution:
> We store per node rss in mm_rss_stat for each process.
>
> If a page allocation with a mempolicy in effect (oc->constraint == CONSTRAINT_MEMORY_POLICY)
> triggers oom, we calculate oom_badness with the rss counter for the corresponding node, then
> select the process with the highest oom_badness on that node to kill.
>
> OOM after patches:
> ```
> Per-node process memory usage (in MBs)
> PID             Node 0     Node 1        Total
> -----------     ---------- ------------- ----------
> 3095 a.out      3073.34    0.11          3073.45
> 3199 b.out      501.35     1500.00       2001.35
> 3805 c.out      1.52       (grow)2248.00 2249.52 (killed)
> -----------     ---------- ------------- ----------
> Total           3576.21    3748.11       7324.31
> ```

You included lots of people but missed Michal Hocko. CC'ing him; please
include him in future postings.

>
> Gang Li (5):
>   mm: add a new parameter `node` to `get/add/inc/dec_mm_counter`
>   mm: add numa_count field for rss_stat
>   mm: add numa fields for tracepoint rss_stat
>   mm: enable per numa node rss_stat count
>   mm, oom: enable per numa node oom for CONSTRAINT_MEMORY_POLICY
>
>  arch/s390/mm/pgtable.c        |   4 +-
>  fs/exec.c                     |   2 +-
>  fs/proc/base.c                |   6 +-
>  fs/proc/task_mmu.c            |  14 ++--
>  include/linux/mm.h            |  59 ++++++++++++-----
>  include/linux/mm_types_task.h |  16 +++++
>  include/linux/oom.h           |   2 +-
>  include/trace/events/kmem.h   |  27 ++++++--
>  kernel/events/uprobes.c       |   6 +-
>  kernel/fork.c                 |  70 +++++++++++++++++++-
>  mm/huge_memory.c              |  13 ++--
>  mm/khugepaged.c               |   4 +-
>  mm/ksm.c                      |   2 +-
>  mm/madvise.c                  |   2 +-
>  mm/memory.c                   | 116 ++++++++++++++++++++++++----------
>  mm/migrate.c                  |   2 +
>  mm/migrate_device.c           |   2 +-
>  mm/oom_kill.c                 |  59 ++++++++++++-----
>  mm/rmap.c                     |  16 ++---
>  mm/swapfile.c                 |   4 +-
>  mm/userfaultfd.c              |   2 +-
>  21 files changed, 317 insertions(+), 111 deletions(-)
>
> --
> 2.20.1
>
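
For illustration only (this is not code from the series): a minimal userspace
sketch of the two selection rules described in the cover letter. The
`struct task_rss` type and the `select_victim()` helper are hypothetical names,
and tasks are scored by RSS alone, whereas the kernel's oom_badness() also
accounts for swap, page tables and oom_score_adj. The figures are taken from
the per-node table in the cover letter, stored in units of 0.01 MB.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES 2

/* Hypothetical stand-in for a task's per-node RSS, in units of 0.01 MB. */
struct task_rss {
	const char *comm;
	unsigned long rss[NR_NODES];
};

/* System-wide score: RSS summed over all nodes. */
static unsigned long total_rss(const struct task_rss *t)
{
	unsigned long sum = 0;

	for (int n = 0; n < NR_NODES; n++)
		sum += t->rss[n];
	return sum;
}

/*
 * Pick the task with the highest score.
 * node < 0:  score by total RSS (behaviour before the series).
 * node >= 0: score by RSS on that node only (the proposed behaviour
 *            for CONSTRAINT_MEMORY_POLICY).
 */
static const struct task_rss *select_victim(const struct task_rss *tasks,
					    size_t nr, int node)
{
	const struct task_rss *victim = NULL;
	unsigned long worst = 0;

	assert(node < NR_NODES);

	for (size_t i = 0; i < nr; i++) {
		unsigned long points = node < 0 ? total_rss(&tasks[i])
						: tasks[i].rss[node];

		if (points > worst) {
			worst = points;
			victim = &tasks[i];
		}
	}
	return victim;
}

/* Per-node usage from the cover letter's table (MB * 100). */
static const struct task_rss example_tasks[] = {
	{ "a.out", { 307334,     11 } },
	{ "b.out", {  50135, 150000 } },
	{ "c.out", {    152, 224800 } },
};
```

With these figures, select_victim(example_tasks, 3, -1) returns the a.out
entry (largest total), while select_victim(example_tasks, 3, 1) returns the
c.out entry (largest usage on Node 1), matching the "before" and "after"
tables.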