Date: Sat, 2 Feb 2019 20:06:07 +0900
From: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
To: Michal Hocko <mhocko@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Johannes Weiner <hannes@...xchg.org>,
David Rientjes <rientjes@...gle.com>, linux-mm@...ck.org,
Yong-Taek Lee <ytk.lee@...sung.com>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] mm, oom: Tolerate processes sharing mm with different
view of oom_score_adj.
On 2019/02/01 18:14, Michal Hocko wrote:
> On Fri 01-02-19 05:59:55, Tetsuo Handa wrote:
>> On 2019/01/31 16:11, Michal Hocko wrote:
>>> This is really ridiculous. I have already nacked the previous version
>>> and provided two ways around. The simplest one is to drop the printk.
>>> The second one is to move oom_score_adj to the mm struct. Could you
>>> explain why do you still push for this?
>>
>> Dropping printk() does not close the race.
>
> But it does remove the source of a long operation from the RCU context.
> If you are not willing to post such a trivial patch I will do so.
>
>> You must propose an alternative patch if you dislike this patch.
>
> I will eventually get there.
>
This is really ridiculous. "Eventually" cannot be justified as a reason for
rejecting this patch. I want a patch which can be easily backported _now_.
If the vfork() => __set_oom_adj() => execve() sequence is permitted, someone
can try a vfork() => clone() => __set_oom_adj() => execve() sequence. The
program below demonstrates that the task->vfork_done based exemption in
__set_oom_adj() is broken: it is not always the task_struct which called
vfork() that will call execve().
----------------------------------------
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sched.h>

/* Runs in the task created by clone(), not in the task created by vfork(). */
static int thread1(void *unused)
{
	char *args[3] = { "/bin/true", "true", NULL };
	int fd = open("/proc/self/oom_score_adj", O_WRONLY);

	/* Reaches __set_oom_adj() while still sharing the mm. */
	write(fd, "1000", 4);
	close(fd);
	execve(args[0], args, NULL);
	return 0;
}

int main(int argc, char *argv[])
{
	printf("PID=%d\n", getpid());
	if (vfork() == 0) {
		/* The vfork() child creates one more task sharing the mm. */
		clone(thread1, malloc(8192) + 8192,
		      CLONE_VM | CLONE_FS | CLONE_FILES, NULL);
		sleep(1);
		_exit(0);
	}
	return 0;
}
----------------------------------------
PID=8802
[ 1138.425255] updating oom_score_adj for 8802 (a.out) from 0 to 1000 because it shares mm with 8804 (a.out). Report if this is unexpected.
The current loop that enforces the same oom_score_adj ends in vain 99%+ of
the time. And even your "eventually" approach will remove this loop.
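
The effect reported in the log line above can also be checked from userspace
without reading the kernel log. The following is only a minimal sketch:
read_oom_score_adj() is a hypothetical helper that is not part of the
reproducer, and it assumes procfs is mounted at /proc.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the reproducer above): read
 * /proc/<pid>/oom_score_adj. Returns 0 and stores the value in *adj on
 * success, or -1 if the file cannot be opened or parsed.
 */
static int read_oom_score_adj(pid_t pid, int *adj)
{
	char path[64];
	FILE *fp;
	int ok;

	snprintf(path, sizeof(path), "/proc/%ld/oom_score_adj", (long)pid);
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	ok = (fscanf(fp, "%d", adj) == 1);
	fclose(fp);
	return ok ? 0 : -1;
}
```

If main() in the reproducer called this on getpid() before vfork() and again
after vfork() returns in the parent, the two values should differ (0, then
1000) on a kernel which printed the message above.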