Date:	Tue, 12 Feb 2013 11:33:39 -0500
From:	Johannes Weiner <hannes@...xchg.org>
To:	Michal Hocko <mhocko@...e.cz>
CC:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Ying Han <yinghan@...gle.com>, Tejun Heo <htejun@...il.com>,
	Glauber Costa <glommer@...allels.com>,
	Li Zefan <lizefan@...wei.com>
Subject: Re: [PATCH v3 4/7] memcg: remove memcg from the reclaim iterators



Michal Hocko <mhocko@...e.cz> wrote:

>On Tue 12-02-13 10:10:02, Johannes Weiner wrote:
>> On Tue, Feb 12, 2013 at 10:54:19AM +0100, Michal Hocko wrote:
>> > On Mon 11-02-13 17:39:43, Johannes Weiner wrote:
>> > > On Mon, Feb 11, 2013 at 10:27:56PM +0100, Michal Hocko wrote:
>> > > > On Mon 11-02-13 14:58:24, Johannes Weiner wrote:
>> > > > > That way, if the dead count gives the go-ahead, you KNOW that the
>> > > > > position cache is valid, because it has been updated first.
>> > > > 
>> > > > OK, you are right. We can live without css_tryget because dead_count is
>> > > > either OK which means that css would be alive at least this rcu period
>> > > > (and RCU walk would be safe as well) or it is incremented which means
>> > > > that we have started css_offline already and then css is dead already.
>> > > > So css_tryget can be dropped.
>> > > 
>> > > Not quite :)
>> > > 
>> > > The dead_count check is for completed destructions,
>> > 
>> > Not quite :P. dead_count is incremented in css_offline callback which is
>> > called before the cgroup core releases its last reference and unlinks
>> > the group from the siblings. css_tryget would already fail at this stage
>> > because CSS_DEACT_BIAS is in place at that time but this doesn't break
>> > RCU walk. So I think we are safe even without css_get.
>> 
>> But you drop the RCU lock before you return.
>>
>> dead_count IS incremented for every destruction, but it's not reliable
>> for concurrent ones, is what I meant.  Again, if there is a dead_count
>> mismatch, your pointer might be dangling, easy case.  However, even if
>> there is no mismatch, you could still race with a destruction that has
>> marked the object dead, and then frees it once you drop the RCU lock,
>> so you need try_get() to check if the object is dead, or you could
>> return a pointer to freed or soon to be freed memory.
>
>Wait a moment. But what prevents the following race?
>
>rcu_read_lock()
>						mem_cgroup_css_offline(memcg)
>						root->dead_count++
>iter->last_dead_count = root->dead_count

Use the dead count that was read the first time for the comparison, i.e. do only one atomic read in that function.  You are right, otherwise we would fail to account for that concurrent destruction.

>iter->last_visited = memcg
>						// final
>						css_put(memcg);
>// last_visited is still valid
>rcu_read_unlock()
>[...]
>// next iteration
>rcu_read_lock()
>iter->last_dead_count == root->dead_count
>// KABOOM
>
>The race window between dead_count++ and css_put is quite big but that
>is not important because that css_put can happen anytime before we start
>the next iteration and take rcu_read_lock.

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.