linux-kernel - Re: [RFC 2/5] memcg: rework mem_cgroup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20121115161504.GF11990@dhcp22.suse.cz>
Date:	Thu, 15 Nov 2012 17:15:04 +0100
From:	Michal Hocko <mhocko@...e.cz>
To:	Tejun Heo <htejun@...il.com>
Cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Ying Han <yinghan@...gle.com>,
	Glauber Costa <glommer@...allels.com>
Subject: Re: [RFC 2/5] memcg: rework mem_cgroup_iter to use cgroup iterators

On Thu 15-11-12 07:31:24, Tejun Heo wrote:
> Hello, Michal.
> 
> On Thu, Nov 15, 2012 at 04:12:55PM +0100, Michal Hocko wrote:
> > > Because I'd like to consider the next functions as implementation
> > > detail, and having interations structred as loops tend to read better
> > > and less error-prone.  e.g. when you use next functions directly, it's
> > > way easier to circumvent locking requirements in a way which isn't
> > > very obvious. 
> > 
> > The whole point behind mem_cgroup_iter is to hide all the complexity
> > behind memcg iteration. Memcg code either use for_each_mem_cgroup_tree
> > for !reclaim case and mem_cgroup_iter otherwise.
> > 
> > > So, unless it messes up the code too much (and I can't see why it
> > > would), I'd much prefer if memcg used for_each_*() macros.
> > 
> > As I said this would mean that the current mem_cgroup_iter code would
> > have to be inverted which doesn't simplify the code much. I'd rather
> > hide all the grossy details inside the memcg iterator.
> > Or am I still missing your suggestion?
> 
> One way or the other, I don't think the code complexity would change
> much.  Again, I'd much *prefer* if memcg used what other controllers
> would be using, but that's a preference and if necessary we can keep
> the next functions as exposed APIs. 

Yes please.

> I think the issue I have is that I can't see much technical
> justification for that. If the code becomes much simpler by choosing
> one over the other, sure, but is that the case here?

Yes and I've tried to say that already. Memcg needs hierarchy, css
ref counting and concurrent reclaim (per-zone per-priority) aware
iteration. All of that is hidden in mem_cgroup_iter currently so the
caller doesn't have to care about it at all. Which makes shrink_zone
not care about memcg that much.

cgroup_for_each_descendant_pre is not suitable at least because it
doesn't provide a way to start a walk at a selected node (which is
shared per-zone per-priority in memcg case).
Even if cgroup_for_each_descendant_pre had start parameter there
is still a lot of house keeping that callers would have to handle
(css_tryget to start with, update of the cached possible not mentioning
use_hierarchy thingy or mem_cgroup_disabled).
We also try to not pollute mm/vmscan.c as much as possible so we
definitely do not want to bring all this into shrink_zone.

This all sounds like too much of a hassle if it is exposed so I would
really like to stay with mem_cgroup_iter and slowly simplify it until it
can go away (if that is possible at all).

> Isn't it mostly just about where to put the same things?

Unfortunately no. We wouldn't grow own iterator in such a case.

> If so, what would be the rationale for requiring a different
> interface?

Does the above explain it?

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/