lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LSU.2.11.1401211218010.5688@eggly.anvils>
Date:	Tue, 21 Jan 2014 13:18:42 -0800 (PST)
From:	Hugh Dickins <hughd@...gle.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	Michal Hocko <mhocko@...e.cz>,
	Johannes Weiner <hannes@...xchg.org>,
	Greg Thelen <gthelen@...gle.com>, linux-mm@...ck.org,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH -mm 2/2] memcg: fix css reference leak and endless loop
 in mem_cgroup_iter

On Tue, 21 Jan 2014, Andrew Morton wrote:
> On Tue, 21 Jan 2014 11:45:43 +0100 Michal Hocko <mhocko@...e.cz> wrote:
> 
> > 19f39402864e (memcg: simplify mem_cgroup_iter) has reorganized
> > mem_cgroup_iter code in order to simplify it. A part of that change was
> > dropping an optimization which didn't call css_tryget on the root of
> > the walked tree. The patch however didn't change the css_put part in
> > mem_cgroup_iter which excludes root.
> > This wasn't an issue at the time because __mem_cgroup_iter_next bailed
> > out for root early without taking a reference as cgroup iterators
> > (css_next_descendant_pre) didn't visit root themselves.
> > 
> > Nevertheless cgroup iterators have been reworked to visit root by
> > bd8815a6d802 (cgroup: make css_for_each_descendant() and friends include
> > the origin css in the iteration) when the root bypass have been dropped
> > in __mem_cgroup_iter_next. This means that css_put is not called for
> > root and so css along with mem_cgroup and other cgroup internal object
> > tied by css lifetime are never freed.
> > 
> > Fix the issue by reintroducing root check in __mem_cgroup_iter_next
> > and do not take css reference for it.
> > 
> > This reference counting magic protects us also from another issue, an
> > endless loop reported by Hugh Dickins when reclaim races with root
> > removal and css_tryget called by iterator internally would fail. There
> > would be no other nodes to visit so __mem_cgroup_iter_next would return
> > NULL and mem_cgroup_iter would interpret it as "start looping from root
> > again" and so mem_cgroup_iter would loop forever internally.
> 
> I grabbed these two patches but I will sit on them for a week or so,
> pending review-n-test.

Thank you, yes, I'm about to give them more testing.

> 
> > Cc: stable@...r.kernel.org # mem_leak part 3.12+
> 
> What does this mean?

It's certainly a confusing comment.

I suggest just deleting the "mem_leak part ": Michal isn't referring to
any two parts of the patch itself, but to parts of his commit comment;
but it's still unclear what he's claiming.

We do have a confusing situation.  The hang goes back to 3.10 but takes
two different forms, because of intervening changes: in 3.10 and 3.11
mem_cgroup_iter repeatedly returns root memcg to its caller, in 3.12 and
3.13 mem_cgroup_iter repeatedly gets NULL memcg from mem_cgroup_iter_next
and cannot return to its caller.

Patch 1/2 is what's needed to fix 3.10 and 3.11 (and applies correctly
to 3.11, but will have to be rediffed for 3.10 because of rearrangement
in between).  Patch 2/2 is what's needed to fix 3.12 and 3.13 (but applies
correctly to neither of them because it's diffed on top of my CSS_ONLINE
fix).  Patch 1/2 is correct but unnecessary in 3.12 and 3.13: I'm unclear
whether Michal is claiming that it would also fix the hang in 3.12 and
3.13 if we didn't have 2/2: I doubt that, and haven't tested that.

Given how Michal has diffed this patch on top of my CSS_ONLINE one
(mm-memcg-iteration-skip-memcgs-not-yet-fully-initialized.patch),
it would be helpful if you could mark that one also for stable 3.12+,
to save us from having to rediff this one for stable.  We don't have
a concrete example of a problem it solves in the vanilla kernel, but
it makes more sense to include it than to exclude it.

(You would be right to point out that the CSS_ONLINE one fixes
something that goes back to 3.10: I'm saying 3.12+ because I'm not
motivated to rediff it for 3.10 and 3.11 when there's nothing to
go on top; but that's not a very good reason to lie - overrule me.)

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ