lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LSU.2.11.1401230203070.1132@eggly.anvils>
Date:	Thu, 23 Jan 2014 02:42:58 -0800 (PST)
From:	Hugh Dickins <hughd@...gle.com>
To:	Michal Hocko <mhocko@...e.cz>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Greg Thelen <gthelen@...gle.com>, linux-mm@...ck.org,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH -mm 2/2] memcg: fix css reference leak and endless loop
 in mem_cgroup_iter

On Wed, 22 Jan 2014, Michal Hocko wrote:
> On Tue 21-01-14 13:18:42, Hugh Dickins wrote:
> [...]
> > We do have a confusing situation.  The hang goes back to 3.10 but takes
> > two different forms, because of intervening changes: in 3.10 and 3.11
> > mem_cgroup_iter repeatedly returns root memcg to its caller, in 3.12 and
> > 3.13 mem_cgroup_iter repeatedly gets NULL memcg from mem_cgroup_iter_next
> > and cannot return to its caller.
> > 
> > Patch 1/2 is what's needed to fix 3.10 and 3.11 (and applies correctly
> > to 3.11, but will have to be rediffed for 3.10 because of rearrangement
> > in between). 
> 
> I will backport it when it reaches stable queue.

Thank you.

Testing, at home and at work, has confirmed your two patches are good.
Greg's particular test on 3.11 gave convincing results, and I was helped
by Linus's tree of last night (df32e43a54d0) happening to be quite quick
to reproduce the issue on my laptop.

> 
> > Patch 2/2 is what's needed to fix 3.12 and 3.13 (but applies
> > correctly to neither of them because it's diffed on top of my CSS_ONLINE
> > fix).  Patch 1/2 is correct but unnecessary in 3.12 and 3.13: I'm unclear
> > whether Michal is claiming that it would also fix the hang in 3.12 and
> > 3.13 if we didn't have 2/2: I doubt that, and haven't tested that.
> 
> Actually both patches are needed. If we had only 2/2 then we wouldn't
> endless loop inside mem_cgroup_iter but we could still return root to
> caller all the time because mem_cgroup_iter_load would return NULL on
> css_tryget failure on the cached root. Or am I missing something that
> would prevent that?

In theory I agree with you; and if you're missing something, I can't see
it either.  But in practice, all my earlier testing of 3.12 and 3.13 was
just with 2/2, and I've tried without your 1/2 since: whereas I have hung
on 3.12 and 3.13 a convincing number of times without 2/2, I have never
hung on them with 2/2 without 1/2.  In practice 1/2 appears essential
for 3.10 and 3.11, but irrelevant for 3.12 and 3.13.

That could be easy to explain if there were a difference at the calling
end, shrink_zone(), between those releases: but I don't see that.  Odd.
Either we're both missing something, or my testing is even less reliable
than I'd thought.  But since I certainly don't dispute 1/2, it is merely
academic.  Though still bothersome.

> 
> > Given how Michal has diffed this patch on top of my CSS_ONLINE one
> > (mm-memcg-iteration-skip-memcgs-not-yet-fully-initialized.patch),
> > it would be helpful if you could mark that one also for stable 3.12+,
> > to save us from having to rediff this one for stable.  We don't have
> > a concrete example of a problem it solves in the vanilla kernel, but
> > it makes more sense to include it than to exclude it.
> 
> Yes, I think it makes sense to queue it for 3.12+ as well because it is
> non intrusive and potential issues would be really subtle.

Before Andrew sends these all off to Linus, I should admit that there's
probably a refinement still to come to the CSS_ONLINE one.  I'm ashamed
to admit that I overlooked a much earlier comment from Greg Thelen, who
suggested that a memory barrier might be required.  I think he's right,
and I'd have liked to say exactly what and where before answering you
now; but barriers are tricky elusive things, I've not yet decided,
and better report back to you now on the testing result.

Hugh

> 
> > (You would be right to point out that the CSS_ONLINE one fixes
> > something that goes back to 3.10: I'm saying 3.12+ because I'm not
> > motivated to rediff it for 3.10 and 3.11 when there's nothing to
> > go on top; but that's not a very good reason to lie - overrule me.)
> > 
> > Hugh
> 
> -- 
> Michal Hocko
> SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ