Date:	Tue, 2 Mar 2010 15:15:44 +0900
From:	Daisuke Nishimura <nishimura@....nes.nec.co.jp>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc:	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"balbir@...ux.vnet.ibm.com" <balbir@...ux.vnet.ibm.com>,
	rientjes@...gle.com,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Daisuke Nishimura <nishimura@....nes.nec.co.jp>
Subject: Re: [BUGFIX][PATCH] memcg: fix oom kill behavior v2

On Tue, 2 Mar 2010 14:56:44 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
> On Tue, 2 Mar 2010 14:37:38 +0900
> Daisuke Nishimura <nishimura@....nes.nec.co.jp> wrote:
> 
> > On Tue, 2 Mar 2010 13:55:24 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
> > > Very sorry, mutex_lock was called after prepare_to_wait.
> > > Here is a fixed version.
> > I'm willing to test your patch, but I have one concern.
> > 
> > > +/*
> > > + * try to call OOM killer. returns false if we should exit memory-reclaim loop.
> > > + */
> > > +bool mem_cgroup_handle_oom(struct mem_cgroup *mem, gfp_t mask)
> > >  {
> > > -	mem_cgroup_walk_tree(mem, NULL, record_last_oom_cb);
> > > +	DEFINE_WAIT(wait);
> > > +	bool locked;
> > > +
> > > +	/* At first, try to OOM lock hierarchy under mem. */
> > > +	mutex_lock(&memcg_oom_mutex);
> > > +	locked = mem_cgroup_oom_lock(mem);
> > > +	if (!locked)
> > > +		prepare_to_wait(&memcg_oom_waitq, &wait, TASK_INTERRUPTIBLE);
> > > +	mutex_unlock(&memcg_oom_mutex);
> > > +
> > > +	if (locked)
> > > +		mem_cgroup_out_of_memory(mem, mask);
> > > +	else {
> > > +		schedule();
> > > +		finish_wait(&memcg_oom_waitq, &wait);
> > > +	}
> > > +	mutex_lock(&memcg_oom_mutex);
> > > +	mem_cgroup_oom_unlock(mem);
> > > +	/* TODO: more fine grained waitq ? */
> > > +	wake_up_all(&memcg_oom_waitq);
> > > +	mutex_unlock(&memcg_oom_mutex);
> > > +
> > > +	if (test_thread_flag(TIF_MEMDIE) || fatal_signal_pending(current))
> > > +		return false;
> > > +	/* Give chance to dying process */
> > > +	schedule_timeout(1);
> > > +	return true;
> > >  }
> > >  
> > Isn't there a race condition like the following?
> > 
> > 	context A				context B
> >   mutex_lock(&memcg_oom_mutex)
> >   mem_cgroup_oom_lock()
> >     ->success
> >   mutex_unlock(&memcg_oom_mutex)
> >   mem_cgroup_out_of_memory()
> > 					mutex_lock(&memcg_oom_mutex)
> > 					mem_cgroup_oom_lock()
> > 					  ->fail
> > 					prepare_to_wait()
> > 					mutex_unlock(&memcg_oom_mutex)
> >   mutex_lock(&memcg_oom_mutex)
> >   mem_cgroup_oom_unlock()
> >   wake_up_all()
> >   mutex_unlock(&memcg_oom_mutex)
> > 					schedule()
> > 					finish_wait()
> > 
> > In this case, context B will never be woken up, right?
> > 
> 
> No. 
> 	prepare_to_wait();
> 	schedule();
> 	finish_wait();
> This call sequence is designed for exactly this kind of waiting.
> 
> 
> 1. Thread B calls prepare_to_wait(); the wait entry is queued and the task's state
>    is changed to TASK_INTERRUPTIBLE.
> 2. Thread A calls wake_up_all(), which walks all waiters in the queue and sets their
>    state back to TASK_RUNNING.
> 3. Thread B calls schedule(), but since its state is already TASK_RUNNING,
>    it will be scheduled again soon; it does not sleep.
> 
Ah, you're right. I had missed point 2.
Thank you for the clarification.
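
Just to make sure I understand, the pattern is roughly the following. This is a
minimal, hypothetical sketch around a generic waitqueue and a made-up
"example_done" flag, not the memcg code itself; the point is that
prepare_to_wait() sets the task state before the condition is checked, so a
wake_up_all() issued at any later point puts the task back to TASK_RUNNING and
the subsequent schedule() returns without sleeping.

#include <linux/wait.h>
#include <linux/sched.h>

static DECLARE_WAIT_QUEUE_HEAD(example_waitq);
static bool example_done;	/* whatever condition the waiter needs */

/* Waiter side (context B in the diagram above). */
static void example_wait(void)
{
	DEFINE_WAIT(wait);

	/*
	 * Queue ourselves and switch to TASK_INTERRUPTIBLE *before*
	 * testing the condition.  A wake_up_all() that runs after this
	 * point resets us to TASK_RUNNING, so schedule() cannot miss it.
	 */
	prepare_to_wait(&example_waitq, &wait, TASK_INTERRUPTIBLE);
	if (!example_done)
		schedule();
	finish_wait(&example_waitq, &wait);
}

/* Waker side (context A in the diagram above). */
static void example_wake(void)
{
	example_done = true;
	wake_up_all(&example_waitq);
}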

I'll test this patch throughout the night and check that it doesn't trigger a
global oom after a memcg oom.


Thanks,
Daisuke Nishimura.


> Then, mutex_lock after prepare_to_wait() is bad ;) (mutex_lock() can sleep and
> would clobber the TASK_INTERRUPTIBLE state that prepare_to_wait() just set.)
> 
> Thanks,
> -Kame
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
