Message-ID: <20150919150316.GB31952@redhat.com>
Date:	Sat, 19 Sep 2015 17:03:16 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Kyle Walker <kwalker@...hat.com>, Christoph Lameter <cl@...ux.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Michal Hocko <mhocko@...nel.org>
Cc:	akpm@...ux-foundation.org, rientjes@...gle.com, hannes@...xchg.org,
	vdavydov@...allels.com, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org,
	Stanislav Kozina <skozina@...hat.com>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Subject: can't oom-kill zap the victim's memory?

On 09/17, Kyle Walker wrote:
>
> Currently, the oom killer will attempt to kill a process that is in
> TASK_UNINTERRUPTIBLE state. For tasks in this state for an exceptional
> period of time, such as processes writing to a frozen filesystem during
> a lengthy backup operation, this can result in a deadlock condition as
> related processes' memory accesses will stall within the page fault
> handler.

And there are other potential reasons for deadlock.

A stupid idea: can't we help the memory hog free its memory? This is
orthogonal to the other improvements we can make.

Please don't tell me the patch below is ugly, incomplete, and suboptimal
in many ways; I know ;) I am not sure it is even correct. It is only meant
to explain what I mean.

Perhaps oom_unmap_func() should only zap the anonymous vmas... and there
are many other details that would need to be discussed if this makes any
sense at all.

Oleg.
---

--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -493,6 +493,26 @@ void oom_killer_enable(void)
 	up_write(&oom_sem);
 }
 
+static struct mm_struct *oom_unmap_mm;
+
+static void oom_unmap_func(struct work_struct *work)
+{
+	struct mm_struct *mm = xchg(&oom_unmap_mm, NULL);
+
+	if (!atomic_inc_not_zero(&mm->mm_users))
+		return;
+
+	// If this is not safe we can do use_mm() + unuse_mm()
+	down_read(&mm->mmap_sem);
+	if (mm->mmap)
+		zap_page_range(mm->mmap, 0, TASK_SIZE, NULL);
+	up_read(&mm->mmap_sem);
+
+	mmput(mm);
+	mmdrop(mm);
+}
+static DECLARE_WORK(oom_unmap_work, oom_unmap_func);
+
 #define K(x) ((x) << (PAGE_SHIFT-10))
 /*
  * Must be called while holding a reference to p, which will be released upon
@@ -570,8 +590,8 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 		victim = p;
 	}
 
-	/* mm cannot safely be dereferenced after task_unlock(victim) */
 	mm = victim->mm;
+	atomic_inc(&mm->mm_count);
 	mark_tsk_oom_victim(victim);
 	pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB\n",
 		task_pid_nr(victim), victim->comm, K(victim->mm->total_vm),
@@ -604,6 +624,10 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 	rcu_read_unlock();
 
 	do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);
+	if (cmpxchg(&oom_unmap_mm, NULL, mm))
+		mmdrop(mm);
+	else
+		queue_work(system_unbound_wq, &oom_unmap_work);
 	put_task_struct(victim);
 }
 #undef K
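
The patch hands the victim's mm to the worker through a single pointer slot:
oom_kill_process() publishes with cmpxchg() only if the slot is empty, and
oom_unmap_func() claims it with xchg(). Below is a minimal user-space sketch
of that handoff, assuming C11 atomics as a stand-in for the kernel
primitives. The names fake_mm, pending_slot, slot_publish() and slot_claim()
are hypothetical and do not appear in the patch; reference counting
(mm_count, mm_users) is deliberately left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct; only its address matters. */
struct fake_mm {
	int id;
};

/* Single pending slot, playing the role of oom_unmap_mm in the patch. */
static _Atomic(struct fake_mm *) pending_slot;

/*
 * Producer side (the cmpxchg() in oom_kill_process()): publish mm only if
 * the slot is currently empty. Returns true when the caller should queue
 * the worker, false when an earlier request is still pending and the
 * caller has to drop its own reference instead.
 */
static bool slot_publish(struct fake_mm *mm)
{
	struct fake_mm *expected = NULL;

	return atomic_compare_exchange_strong(&pending_slot, &expected, mm);
}

/*
 * Worker side (the xchg() in oom_unmap_func()): take whatever is pending
 * and leave the slot empty so that the next victim can be published.
 */
static struct fake_mm *slot_claim(void)
{
	return atomic_exchange(&pending_slot, (struct fake_mm *)NULL);
}
```

With this shape at most one mm is pending at any time; a second publish
attempt before the worker has run is simply refused, which corresponds to
the mmdrop(mm) branch in the patch.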
