linux-kernel - Re: [PATCH v6 11/29] memcg: allow a memcg with kmem charges to be destructed.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Fri, 2 Nov 2012 11:50:32 +0400
From:	Glauber Costa <glommer@...allels.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
CC:	<linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
	<kamezawa.hiroyu@...fujitsu.com>,
	Johannes Weiner <hannes@...xchg.org>,
	"Tejun Heo" <tj@...nel.org>, Michal Hocko <mhocko@...e.cz>,
	Christoph Lameter <cl@...ux.com>,
	Pekka Enberg <penberg@...nel.org>,
	David Rientjes <rientjes@...gle.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	"Suleiman Souhlal" <suleiman@...gle.com>
Subject: Re: [PATCH v6 11/29] memcg: allow a memcg with kmem charges to be
 destructed.

On 11/02/2012 04:05 AM, Andrew Morton wrote:
> On Thu,  1 Nov 2012 16:07:27 +0400
> Glauber Costa <glommer@...allels.com> wrote:
> 
>> Because the ultimate goal of the kmem tracking in memcg is to track slab
>> pages as well, we can't guarantee that we'll always be able to point a
>> page to a particular process, and migrate the charges along with it -
>> since in the common case, a page will contain data belonging to multiple
>> processes.
>>
>> Because of that, when we destroy a memcg, we only make sure the
>> destruction will succeed by discounting the kmem charges from the user
>> charges when we try to empty the cgroup.
> 
> There was a significant conflict with the sched/numa changes in
> linux-next, which I resolved as below.  Please check it.
> 
> static int mem_cgroup_reparent_charges(struct mem_cgroup *memcg)
> {
> 	struct cgroup *cgrp = memcg->css.cgroup;
> 	int node, zid;
> 	u64 usage;
> 
> 	do {
> 		if (cgroup_task_count(cgrp) || !list_empty(&cgrp->children))
> 			return -EBUSY;
> 		/* This is for making all *used* pages to be on LRU. */
> 		lru_add_drain_all();
> 		drain_all_stock_sync(memcg);
> 		mem_cgroup_start_move(memcg);
> 		for_each_node_state(node, N_HIGH_MEMORY) {
> 			for (zid = 0; zid < MAX_NR_ZONES; zid++) {
> 				enum lru_list lru;
> 				for_each_lru(lru) {
> 					mem_cgroup_force_empty_list(memcg,
> 							node, zid, lru);
> 				}
> 			}
> 		}
> 		mem_cgroup_end_move(memcg);
> 		memcg_oom_recover(memcg);
> 		cond_resched();
> 
> 		/*
> 		 * Kernel memory may not necessarily be trackable to a specific
> 		 * process. So they are not migrated, and therefore we can't
> 		 * expect their value to drop to 0 here.
> 		 * Having res filled up with kmem only is enough.
> 		 *
> 		 * This is a safety check because mem_cgroup_force_empty_list
> 		 * could have raced with mem_cgroup_replace_page_cache callers
> 		 * so the lru seemed empty but the page could have been added
> 		 * right after the check. RES_USAGE should be safe as we always
> 		 * charge before adding to the LRU.
> 		 */
> 		usage = res_counter_read_u64(&memcg->res, RES_USAGE) -
> 			res_counter_read_u64(&memcg->kmem, RES_USAGE);
> 	} while (usage > 0);
> 
> 	return 0;
> }
> 
Andrew,

It looks fine.

One thing: Open code reading makes very difficult to spot which exactly
the conflict is. Any chance you could send those in some kind of diff
format?

In this case, it appears to me that the reason for the conflict is that
the loop conditional "while (usage > 0 || ret)" was changed to "while
(usage > 0)". Being this the case, and because kmemcg has no business
with that "ret" value, this resolution is appropriate.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/