linux-kernel - [PATCH 2/6] memcg: handle limit change


Message-Id: <20080613183015.e2b67415.kamezawa.hiroyu@jp.fujitsu.com>
Date:	Fri, 13 Jun 2008 18:30:15 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc:	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>,
	"menage@...gle.com" <menage@...gle.com>,
	"balbir@...ux.vnet.ibm.com" <balbir@...ux.vnet.ibm.com>,
	"xemul@...nvz.org" <xemul@...nvz.org>,
	"yamamoto@...inux.co.jp" <yamamoto@...inux.co.jp>,
	"nishimura@....nes.nec.co.jp" <nishimura@....nes.nec.co.jp>,
	"lizf@...fujitsu.com" <lizf@...fujitsu.com>
Subject: [PATCH 2/6] memcg: handle limit change

Add a callback for resize_limit().

After this patch, lowering the limit makes memcg reclaim pages until its
usage fits under the new limit. If usage cannot be reduced that far,
-EBUSY is returned to the write() syscall.

This patch also tries to free all pages at force_empty by reusing the
shrink function.

Change log: xxx -> v4
 - cut out from the memcg hierarchy patch set.
 - added retry_count as a new argument.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>

---
 Documentation/controllers/memory.txt |    3 --
 mm/memcontrol.c                      |   47 ++++++++++++++++++++++++++++++++---
 2 files changed, 45 insertions(+), 5 deletions(-)
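
For reviewers, the limit-shrinking flow described in the changelog can be
sketched in userspace as below. This is only an illustration, not the kernel
code: struct counter, struct counter_ops, fake_reclaim() and resize_limit()
are hypothetical stand-ins, RECLAIM_RETRIES stands in for
MEM_CGROUP_RECLAIM_RETRIES with an arbitrary value, and the caller side
(how res_counter invokes the callback) is an assumption, since that code is
not part of this message.

```c
#include <assert.h>
#include <errno.h>

#define RECLAIM_RETRIES 5	/* stand-in for MEM_CGROUP_RECLAIM_RETRIES */

struct counter;

/* Hypothetical mirror of the ops table registered by the patch. */
struct counter_ops {
	int (*shrink_usage)(struct counter *c, unsigned long long val,
			    int retry_count);
};

struct counter {
	unsigned long long usage;
	unsigned long long limit;
	unsigned long long reclaimable;	/* bytes a reclaim pass can still free */
	const struct counter_ops *ops;
};

/* Stand-in for page reclaim: frees at most one 4096-byte chunk and
 * returns the number of bytes freed (0 means no progress). */
static unsigned long long fake_reclaim(struct counter *c)
{
	unsigned long long chunk = 4096;

	assert(c->reclaimable <= c->usage);
	if (chunk > c->reclaimable)
		chunk = c->reclaimable;
	c->usage -= chunk;
	c->reclaimable -= chunk;
	return chunk;
}

/* Same shape as the callback in the patch: give up with -EBUSY once the
 * caller has retried too often, otherwise reclaim until usage <= val or
 * until a pass makes no progress. */
static int shrink_usage_to(struct counter *c, unsigned long long val,
			   int retry_count)
{
	if (retry_count > RECLAIM_RETRIES)
		return -EBUSY;

	while (c->usage > val) {
		if (fake_reclaim(c) == 0)
			return 0;	/* no progress; caller re-checks */
	}
	return 0;
}

/* Assumed caller side: a limit write drives the callback, bumping
 * retry_count on each call, and only commits the new limit once usage
 * fits under it. */
static int resize_limit(struct counter *c, unsigned long long new_limit)
{
	int retry_count = 0;
	int ret;

	while (c->usage > new_limit) {
		if (!c->ops || !c->ops->shrink_usage)
			return -EBUSY;
		ret = c->ops->shrink_usage(c, new_limit, retry_count++);
		if (ret)
			return ret;
	}
	c->limit = new_limit;
	return 0;
}
```

In this sketch, a counter whose pages are all reclaimable shrinks to the new
limit and resize_limit() returns 0. A counter that runs out of reclaimable
pages makes no further progress, so the retry budget is exhausted and
-EBUSY is returned with the old limit left in place.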

Index: linux-2.6.26-rc5-mm3/mm/memcontrol.c
===================================================================
--- linux-2.6.26-rc5-mm3.orig/mm/memcontrol.c
+++ linux-2.6.26-rc5-mm3/mm/memcontrol.c
@@ -779,6 +779,44 @@ int mem_cgroup_shrink_usage(struct mm_st
 }
 
 /*
+ * A callback for shrinking usage to a new limit. Always uses GFP_KERNEL.
+ */
+int mem_cgroup_shrink_usage_to(struct res_counter *res, unsigned long long val,
+			 int retry_count)
+{
+	struct mem_cgroup *memcg = container_of(res, struct mem_cgroup, res);
+
+	if (retry_count > MEM_CGROUP_RECLAIM_RETRIES)
+		return -EBUSY;
+
+retry:
+	if (res_counter_check_under_val(res, val))
+		return 0;
+
+	cond_resched();
+	if (try_to_free_mem_cgroup_pages(memcg, GFP_KERNEL) == 0)
+		return 0; /* no progress...*/
+
+	goto retry;
+}
+
+/*
+ * Must be called when there are no users of this cgroup.
+ */
+static void memcg_shrink_usage_all(struct mem_cgroup *memcg)
+{
+	int retry_count = 0;
+	int ret = 0;
+
+	while (!ret && !res_counter_check_under_val(&memcg->res, 0)) {
+		ret = mem_cgroup_shrink_usage_to(&memcg->res, 0, retry_count);
+		retry_count++;
+	}
+
+	return;
+}
+
+/*
  * This routine traverse page_cgroup in given list and drop them all.
  * *And* this routine doesn't reclaim page itself, just removes page_cgroup.
  */
@@ -835,9 +873,10 @@ static int mem_cgroup_force_empty(struct
 	 * active_list <-> inactive_list while we don't take a lock.
 	 * So, we have to do loop here until all lists are empty.
 	 */
-	while (mem->res.usage > 0) {
+	while (!res_counter_check_under_val(&mem->res, 0)) {
 		if (atomic_read(&mem->css.cgroup->count) > 0)
 			goto out;
+		memcg_shrink_usage_all(mem);
 		for_each_node_state(node, N_POSSIBLE)
 			for (zid = 0; zid < MAX_NR_ZONES; zid++) {
 				struct mem_cgroup_per_zone *mz;
@@ -1046,13 +1085,15 @@ static void mem_cgroup_free(struct mem_c
 		vfree(mem);
 }
 
+struct res_counter_ops root_ops = {
+	.shrink_usage = mem_cgroup_shrink_usage_to,
+};
 
 static struct cgroup_subsys_state *
 mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
 {
 	struct mem_cgroup *mem;
 	int node;
-
 	if (unlikely((cont->parent) == NULL)) {
 		mem = &init_mem_cgroup;
 		page_cgroup_cache = KMEM_CACHE(page_cgroup, SLAB_PANIC);
@@ -1062,7 +1103,7 @@ mem_cgroup_create(struct cgroup_subsys *
 			return ERR_PTR(-ENOMEM);
 	}
 
-	res_counter_init(&mem->res);
+	res_counter_init_ops(&mem->res, &root_ops);
 
 	for_each_node_state(node, N_POSSIBLE)
 		if (alloc_mem_cgroup_per_zone_info(mem, node))
Index: linux-2.6.26-rc5-mm3/Documentation/controllers/memory.txt
===================================================================
--- linux-2.6.26-rc5-mm3.orig/Documentation/controllers/memory.txt
+++ linux-2.6.26-rc5-mm3/Documentation/controllers/memory.txt
@@ -242,8 +242,7 @@ rmdir() if there are no tasks.
 1. Add support for accounting huge pages (as a separate controller)
 2. Make per-cgroup scanner reclaim not-shared pages first
 3. Teach controller to account for shared-pages
-4. Start reclamation when the limit is lowered
-5. Start reclamation in the background when the limit is
+4. Start reclamation in the background when the limit is
    not yet hit but the usage is getting closer
 
 Summary
