lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150922124303.GA24570@redhat.com>
Date:	Tue, 22 Sep 2015 14:43:03 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	mhocko@...nel.org, torvalds@...ux-foundation.org,
	kwalker@...hat.com, cl@...ux.com, akpm@...ux-foundation.org,
	rientjes@...gle.com, hannes@...xchg.org, vdavydov@...allels.com,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	skozina@...hat.com
Subject: Re: can't oom-kill zap the victim's memory?

On 09/22, Tetsuo Handa wrote:
>
> I imagined a dedicated kernel thread doing something like shown below.
> (I don't know about mm->mmap management.)
> mm->mmap_zapped corresponds to MMF_MEMDIE.

No, it doesn't, please see below.

> bool has_sigkill_task;
> wait_queue_head_t kick_mm_zapper;

OK, if this kthread is kicked by oom this makes more sense, but still
doesn't look right at least initially.

Let me repeat, I do think we need MMF_MEMDIE or something like it before
we do something more clever. And in fact I think this flag makes sense
regardless.

> static void mm_zapper(void *unused)
> {
> 	struct task_struct *g, *p;
> 	struct mm_struct *mm;
>
> sleep:
> 	wait_event(kick_remover, has_sigkill_task);
> 	has_sigkill_task = false;
> restart:
> 	rcu_read_lock();
> 	for_each_process_thread(g, p) {
> 		if (likely(!fatal_signal_pending(p)))
> 			continue;
> 		task_lock(p);
> 		mm = p->mm;
> 		if (mm && mm->mmap && !mm->mmap_zapped && down_read_trylock(&mm->mmap_sem)) {
                                       ^^^^^^^^^^^^^^^

We do not want mm->mmap_zapped, it can't work. We need mm->needs_zap
set by oom_kill_process() and cleared after zap_page_range().

Because otherwise we can not handle CLONE_VM correctly. Suppose that
an innocent process P does vfork() and the child is killed but not
exited yet. mm_zapper() can find the child, do zap_page_range(), and
surprise its alive parent P which uses the same ->mm.

And if we rely on MMF_MEMDIE or mm->needs_zap or whaveter then
for_each_process_thread() doesn't really make sense. And if we have
a single MMF_MEMDIE process (likely case) then the unconditional
_trylock is suboptimal.

Tetsuo, can't we do something simple which "obviously can't hurt at
least" and then discuss the potential improvements?

And yes, yes, the "Kill all user processes sharing victim->mm" logic
in oom_kill_process() doesn't 100% look right, at least wrt the change
we discuss.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ