Message-ID: <20110926082837.GC10156@tiehlicka.suse.cz>
Date: Mon, 26 Sep 2011 10:28:37 +0200
From: Michal Hocko <mhocko@...e.cz>
To: David Rientjes <rientjes@...gle.com>
Cc: Oleg Nesterov <oleg@...hat.com>,
Konstantin Khlebnikov <khlebnikov@...nvz.org>,
linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
"Rafael J. Wysocki" <rjw@...k.pl>, Tejun Heo <tj@...nel.org>,
Rusty Russell <rusty@...tcorp.com.au>
Subject: [PATCH 1/2] oom: do not live lock on frozen tasks
[Let's add some more people to the CC list]
Sorry it took so long, but I have been quite busy recently.
On Fri 26-08-11 11:13:40, David Rientjes wrote:
> On Fri, 26 Aug 2011, Michal Hocko wrote:
[...]
> > I am not saying the bonus is necessary, though. It depends on what
> > the freezer is used for (e.g. freeze a process which went wild and
> > debug what went wrong wouldn't welcome that somebody killed it or other
> > (mis)use which relies on D state).
> >
>
> I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer
> followed by a SIGKILL if that task is selected for oom kill without an
> heuristic change. Not sure if that's possible, so we'll wait for Rafael
> to chime in.
We have discussed that with Rafael and it should be safe to do that. See
the patch below.
The only place I am not entirely sure about is run_guest
(drivers/lguest/core.c). It seems that the code is able to cope with
signals but it also calls lguest_arch_run_guest after try_to_freeze.
---
From aea3c3fba1e172373877df03a335848cae4b717e Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@...e.cz>
Date: Fri, 16 Sep 2011 11:23:15 +0200
Subject: [PATCH 1/2] oom: do not live lock on frozen tasks
Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45)
that OOM can end up in a live lock if select_bad_process picks up a frozen
task.
Unfortunately we cannot mark such processes as unkillable and ignore them,
because we could panic the system even though there is a chance that
somebody could thaw the process so that we can make forward progress
(e.g. a process from another cpuset or with a different nodemask).
Let's thaw an OOM-selected frozen process right after we've sent the fatal
signal from oom_kill_task.
Thawing is safe if the frozen task doesn't access any suspended device
(e.g. by ioctl) on the way out to the userspace where we handle the
signal and die. Note, we are not interested in the kernel threads because
they are not oom killable.
Accessing suspended devices by userspace processes shouldn't be an
issue because devices are suspended only after userspace is already
frozen and oom is disabled at that time.
run_guest (drivers/lguest/core.c) calls try_to_freeze with a user
context, but it seems to be able to cope with signals because it
explicitly checks for pending signals, so we should be safe.
Other than that, userspace accesses the fridge only from the
signal handling routines, so we are able to handle SIGKILL without any
negative side effects.
Signed-off-by: Michal Hocko <mhocko@...e.cz>
Reported-by: Konstantin Khlebnikov <khlebnikov@...nvz.org>
---
mm/oom_kill.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 626303b..b9774f3 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -32,6 +32,7 @@
#include <linux/mempolicy.h>
#include <linux/security.h>
#include <linux/ptrace.h>
+#include <linux/freezer.h>
int sysctl_panic_on_oom;
int sysctl_oom_kill_allocating_task;
@@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem)
task_pid_nr(q), q->comm);
task_unlock(q);
force_sig(SIGKILL, q);
+
+ if (frozen(q))
+ thaw_process(q);
}
set_tsk_thread_flag(p, TIF_MEMDIE);
--
1.7.5.4
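
For illustration only, here is a minimal userspace sketch of the live lock
described in the changelog and of why thawing the victim resolves it. It is
not kernel code: struct model_task and all model_* functions are
hypothetical names made up for this sketch, and the "scheduler" is reduced
to a single step in which only a non-frozen task can notice a pending
SIGKILL. The thaw_after_kill flag stands in for the two lines the patch
adds after force_sig().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a task; NOT the kernel's task_struct. */
struct model_task {
	bool frozen;          /* parked in the freezer, never scheduled */
	bool sigkill_pending; /* a fatal signal has been queued */
	bool exited;          /* the task died and released its memory */
};

/* One scheduling step: only a non-frozen task can act on the signal. */
void model_run(struct model_task *t)
{
	if (!t->frozen && t->sigkill_pending)
		t->exited = true;
}

/*
 * Model of the kill path. Setting sigkill_pending stands in for
 * force_sig(SIGKILL, q); the conditional thaw stands in for
 * "if (frozen(q)) thaw_process(q);" from the patch.
 */
void model_oom_kill(struct model_task *t, bool thaw_after_kill)
{
	assert(t != NULL);
	t->sigkill_pending = true;
	if (thaw_after_kill && t->frozen)
		t->frozen = false;
}

/*
 * Repeatedly "select" the same frozen victim, as the OOM killer would.
 * Returns the round in which the victim exits, or -1 if it never does
 * within max_rounds (the modelled live lock).
 */
int model_oom_rounds(bool thaw_after_kill, int max_rounds)
{
	struct model_task victim = { .frozen = true };

	for (int round = 1; round <= max_rounds; round++) {
		model_oom_kill(&victim, thaw_after_kill);
		model_run(&victim);
		if (victim.exited)
			return round;
	}
	return -1;
}
```

In this model, model_oom_rounds(false, N) never sees the victim exit for any
N, while model_oom_rounds(true, N) lets it exit in the first round. The real
code paths (signal delivery, the refrigerator loop, TIF_MEMDIE handling) are
considerably more involved than this sketch.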
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic