Message-ID: <20160229182131.GP16930@dhcp22.suse.cz>
Date:	Mon, 29 Feb 2016 19:21:31 +0100
From:	Michal Hocko <mhocko@...nel.org>
To:	Vladimir Davydov <vdavydov@...tuozzo.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	David Rientjes <rientjes@...gle.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] exit: clear TIF_MEMDIE after exit_task_work

On Mon 29-02-16 20:02:09, Vladimir Davydov wrote:
> An mm_struct may be pinned by a file. An example is vhost-net device
> created by a qemu/kvm (see vhost_net_ioctl -> vhost_net_set_owner ->
> vhost_dev_set_owner). If such process gets OOM-killed, the reference to
> its mm_struct will only be released from exit_task_work -> ____fput ->
> __fput -> vhost_net_release -> vhost_dev_cleanup, which is called after
> exit_mmap, where TIF_MEMDIE is cleared. As a result, we can start
> selecting the next victim before giving the last one a chance to free
> its memory. In practice, this leads to killing several VMs along with
> the fattest one.
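
The ordering described in the quoted text above can be modelled with a toy user-space C sketch. It is not kernel code: the toy_* types and functions are hypothetical stand-ins for mm_struct, mmput, exit_mm and exit_task_work, and the model assumes exactly two references to the mm (one held by the task, one held by a file such as a vhost-net device).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for kernel state; these are NOT the real kernel types. */
struct toy_mm {
	int users;	/* reference count, loosely modelling mm_users */
	bool freed;	/* set once the last reference is dropped */
};

struct toy_task {
	struct toy_mm *mm;
	bool memdie;		/* models TIF_MEMDIE */
	bool file_pins_mm;	/* models a file holding an mm reference */
};

/* Drop one reference; the memory is only "freed" on the last put. */
static void toy_mmput(struct toy_mm *mm)
{
	assert(mm->users > 0);
	if (--mm->users == 0)
		mm->freed = true;
}

/* Models exit_mm(): drops the task's own reference to the mm. */
static void toy_exit_mm(struct toy_task *t)
{
	toy_mmput(t->mm);
}

/* Models exit_task_work(): the final fput releases the file's reference. */
static void toy_exit_task_work(struct toy_task *t)
{
	if (t->file_pins_mm) {
		toy_mmput(t->mm);
		t->file_pins_mm = false;
	}
}

/*
 * Run the toy exit path and report whether the memory had already been
 * freed at the moment the "memdie" flag was cleared.
 */
bool toy_do_exit(bool clear_after_task_work)
{
	struct toy_mm mm = { .users = 2, .freed = false };
	struct toy_task t = { .mm = &mm, .memdie = true, .file_pins_mm = true };
	bool freed_at_clear = false;

	toy_exit_mm(&t);
	if (!clear_after_task_work) {
		/* Ordering before the patch: clear right after exit_mm. */
		t.memdie = false;
		freed_at_clear = mm.freed;
	}

	toy_exit_task_work(&t);
	if (clear_after_task_work) {
		/* Ordering proposed by the patch: clear after task work. */
		t.memdie = false;
		freed_at_clear = mm.freed;
	}

	return freed_at_clear;
}
```

In this model, toy_do_exit(false) returns false because the file still pins the mm when the flag is cleared, while toy_do_exit(true) returns true. The model ignores locking and every other dependency in the real do_exit path.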

I am wondering why our PF_EXITING protection hasn't kicked in. This is
not done in the mmotm tree, but I guess you have seen the issue with
Linus' tree, right? Do you have a log with the OOM reports available?

To be honest, I do not feel very comfortable about moving
exit_oom_victim even further down the do_exit path, behind even less
clear locking and other dependencies.

Let's see if we can do any better for this particular case. 

> Signed-off-by: Vladimir Davydov <vdavydov@...tuozzo.com>
> ---
>  kernel/exit.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/exit.c b/kernel/exit.c
> index fd90195667e1..cc50e12165f7 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -434,8 +434,6 @@ static void exit_mm(struct task_struct *tsk)
>  	task_unlock(tsk);
>  	mm_update_next_owner(mm);
>  	mmput(mm);
> -	if (test_thread_flag(TIF_MEMDIE))
> -		exit_oom_victim(tsk);
>  }
>  
>  static struct task_struct *find_alive_thread(struct task_struct *p)
> @@ -746,6 +744,8 @@ void do_exit(long code)
>  		disassociate_ctty(1);
>  	exit_task_namespaces(tsk);
>  	exit_task_work(tsk);
> +	if (test_thread_flag(TIF_MEMDIE))
> +		exit_oom_victim(tsk);
>  	exit_thread();
>  
>  	/*
> -- 
> 2.1.4

-- 
Michal Hocko
SUSE Labs