lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20100226151506.c78b4312.kamezawa.hiroyu@jp.fujitsu.com>
Date:	Fri, 26 Feb 2010 15:15:06 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	Minchan Kim <minchan.kim@...il.com>
Cc:	Andrea Righi <arighi@...eler.com>, Vivek Goyal <vgoyal@...hat.com>,
	David Rientjes <rientjes@...gle.com>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Suleiman Souhlal <suleiman@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	containers@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] memcg: dirty pages accounting and limiting 
 infrastructure

On Fri, 26 Feb 2010 14:53:39 +0900
Minchan Kim <minchan.kim@...il.com> wrote:

> On Fri, Feb 26, 2010 at 2:01 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@...fujitsu.com> wrote:
> > Hi,
> >
> > On Fri, 26 Feb 2010 13:50:04 +0900
> > Minchan Kim <minchan.kim@...il.com> wrote:
> >
> >> > Hm ? I don't read the whole thread but can_attach() is called under
> >> > cgroup_mutex(). So, it doesn't need to use RCU.
> >>
> >> Vivek mentioned memcg is protected by RCU if I understand his intention right.
> >> So I commented that without enough knowledge of memcg.
> >> After your comment, I dive into the code.
> >>
> >> Just out of curiosity.
> >>
> >> Really, memcg is protected by RCU?
> > yes. All cgroup subsystem is protected by RCU.
> >
> >> I think most of RCU around memcg is for protecting task_struct and
> >> cgroup_subsys_state.
> >> The memcg is protected by cgroup_mutex as you mentioned.
> >> Am I missing something?
> >
> > There are several levels of protections.
> >
> > cgroup subsystem's ->destroy() call back is finally called by
> >
> > As this.
> >
> >  768                 synchronize_rcu();
> >  769
> >  770                 mutex_lock(&cgroup_mutex);
> >  771                 /*
> >  772                  * Release the subsystem state objects.
> >  773                  */
> >  774                 for_each_subsys(cgrp->root, ss)
> >  775                         ss->destroy(ss, cgrp);
> >  776
> >  777                 cgrp->root->number_of_cgroups--;
> >  778                 mutex_unlock(&cgroup_mutex);
> >
> > Before here,
> >        - there are no tasks under this cgroup (cgroup's refcnt is 0)
> >          && cgroup is marked as REMOVED.
> >
> > Then, this access
> >        rcu_read_lock();
> >        mem = mem_cgroup_from_task(task);
> >        if (css_tryget(mem->css))   <===============checks cgroup refcnt
> 
> If it it, do we always need css_tryget after mem_cgroup_from_task
> without cgroup_mutex to make sure css is vaild?
> 
On a case by cases. 

> But I found several cases that don't call css_tryget
> 
> 1. mm_match_cgroup
> It's used by page_referenced_xxx. so we I think we don't grab
> cgroup_mutex at that time.
> 
yes. but all check are done under RCU. And this function never
access contents of memory which pointer holds.
And, please conider the whole context.

	mem_cgroup_try_charge()
		mem_cout =..... (refcnt +1)
		....
		try_to_free_mem_cgroup_pages(mem_cont)
		....
		shrink_xxx_list()
		....
			page_referenced_anon(page, mem_cont)
				mm_match_cgroup(mm, mem_cont)

Then, this mem_cont (2nd argument to mm_match_cgroup) is always valid.
	rcu_read_lock();
	memcg = mem_cgroup_from_task(rcu_dereference(mm->ownder));
	rcu_read_unlock();
	return memcg != mem_cont;

Here,
	1. mem_cont is never reused. (because refcnt+1)
	2. we don't access memcg's contents.

I think even rcu_read_lock()/unlock() is unnecessary.



> 2. mem_cgroup_oom_called
> I think in here we don't grab cgroup_mutex, too.
> 
In OOM-killer context, memcg which causes OOM has refcnt +1.
Then, not necessary. 


> I guess some design would cover that problems.
Maybe.

> Could you tell me if you don't mind?
> Sorry for bothering you.
> 

In my point of view, the most terrible porblem is heavy cost of
css_tryget() when you run multi-thread heavy program.
So, I want to see some inovation, but haven't find yet.

I admit this RCU+refcnt is tend to be hard to review. But it's costly
operation to take refcnt and it's worth to be handled in the best
usage of each logics, on a case by cases for now.

Thanks,
-Kame


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ