Message-ID: <alpine.DEB.2.10.1509221631040.7794@chino.kir.corp.google.com>
Date:	Tue, 22 Sep 2015 16:32:38 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
cc:	mhocko@...nel.org, cl@...ux.com, oleg@...hat.com,
	kwalker@...hat.com, akpm@...ux-foundation.org, hannes@...xchg.org,
	vdavydov@...allels.com, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, skozina@...hat.com
Subject: Re: [PATCH] mm/oom_kill.c: don't kill TASK_UNINTERRUPTIBLE tasks

On Tue, 22 Sep 2015, Tetsuo Handa wrote:

> David Rientjes wrote:
> > Your proposal, which I mostly agree with, tries to kill additional 
> > processes so that they allocate and drop the lock that the original victim 
> > depends on.  My approach, from 
> > http://marc.info/?l=linux-kernel&m=144010444913702, is the same, but 
> > without the killing.  It's unnecessary to kill every process on the system 
> > that is depending on the same lock, and we can't know which processes are 
> > stalling on that lock and which are not.
> 
> Would you try your approach with the program below?
> (My reproducers are tested on XFS on a VM with 4 CPUs / 2048MB RAM.)
> 
> ---------- oom-depleter3.c start ----------
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <sched.h>
> 
> static int zero_fd = EOF;
> static char *buf = NULL;
> static unsigned long size = 0;
> 
> static int dummy(void *unused)
> {
> 	static char buffer[4096] = { };
> 	int fd = open("/tmp/file", O_WRONLY | O_CREAT | O_APPEND, 0600);
> 	while (write(fd, buffer, sizeof(buffer)) == sizeof(buffer) &&
> 	       fsync(fd) == 0);
> 	return 0;
> }
> 
> static int trigger(void *unused)
> {
> 	read(zero_fd, buf, size); /* Will cause OOM due to overcommit */
> 	return 0;
> }
> 
> int main(int argc, char *argv[])
> {
> 	unsigned long i;
> 	zero_fd = open("/dev/zero", O_RDONLY);
> 	for (size = 1048576; size < 512UL * (1 << 30); size <<= 1) {
> 		char *cp = realloc(buf, size);
> 		if (!cp) {
> 			size >>= 1;
> 			break;
> 		}
> 		buf = cp;
> 	}
> 	/*
> 	 * Create many child threads to widen the window between the
> 	 * OOM killer setting TIF_MEMDIE on the thread group leader and
> 	 * the OOM killer sending SIGKILL to that thread.
> 	 */
> 	for (i = 0; i < 1000; i++) {
> 		clone(dummy, malloc(1024) + 1024, CLONE_SIGHAND | CLONE_VM,
> 		      NULL);
> 	}
> 	/* Let a child thread trigger the OOM killer. */
> 	clone(trigger, malloc(4096) + 4096, CLONE_SIGHAND | CLONE_VM, NULL);
> 	/* Deplete all memory reserve using the time lag. */
> 	for (i = size; i; i -= 4096)
> 		buf[i - 1] = 1;
> 	return * (char *) NULL; /* Kill all threads. */
> }
> ---------- oom-depleter3.c end ----------
> 
> uptime > 350 of http://I-love.SAKURA.ne.jp/tmp/serial-20150922-1.txt.xz
> shows that the memory reserves were completely depleted and
> uptime > 42 of http://I-love.SAKURA.ne.jp/tmp/serial-20150922-2.txt.xz
> shows that the memory reserves were not used at all.
> Is this the result you expected?
> 

What are the results when the kernel isn't patched at all?  The trade-off 
being made is that we want to attempt to make forward progress when there 
is an excessive stall in an oom victim making its exit rather than 
livelock the system forever waiting for memory that can never be 
allocated.

I struggle to understand how the approach of randomly continuing to kill 
more and more processes in the hope that it slows down usage of memory 
reserves or that we get lucky is better.
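
As a rough illustration of the trade-off described above, here is a minimal
user-space sketch of one possible reading of it: wait for the oom victim for
a bounded time, and once the stall is judged excessive, let allocators use
memory reserves instead of selecting further victims.  This is a toy model,
not kernel code and not the patch linked earlier in the thread.  Every name
in it (oom_action, oom_state, oom_decide, stall_timeout) is hypothetical and
exists only for this example.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the policy; these are not kernel symbols. */
enum oom_action {
	OOM_SELECT_VICTIM,	/* no victim is pending, so choose one */
	OOM_WAIT_FOR_VICTIM,	/* victim is still within its grace period */
	OOM_ALLOW_RESERVES,	/* victim has stalled: allow use of reserves */
};

/* Hypothetical bookkeeping for the toy model. */
struct oom_state {
	bool victim_pending;		/* a victim was chosen and has not exited */
	unsigned long victim_since;	/* tick at which the victim was chosen */
	unsigned long stall_timeout;	/* ticks tolerated before calling it a stall */
};

/*
 * Decide what to do at tick 'now'.  No additional task is ever killed
 * while a victim is pending; after the timeout the model only widens
 * access to reserves so that lock holders can allocate and make progress.
 */
static enum oom_action oom_decide(const struct oom_state *st,
				  unsigned long now)
{
	if (!st->victim_pending)
		return OOM_SELECT_VICTIM;
	if (now - st->victim_since < st->stall_timeout)
		return OOM_WAIT_FOR_VICTIM;
	return OOM_ALLOW_RESERVES;
}
```

The sketch deliberately leaves out the hard part raised in this thread,
namely how quickly the reserves themselves can be depleted once access to
them is widened.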