lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <48AA684B.7000704@cn.fujitsu.com>
Date:	Tue, 19 Aug 2008 14:29:31 +0800
From:	Lai Jiangshan <laijs@...fujitsu.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
CC:	Paul Menage <menage@...gle.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: [PATCH] cgroups: fix probable race with put_css_set[_taskexit] and
 find_css_set

put_css_set_taskexit may be called when find_css_set is called on
other cpu. And the race will occur:

put_css_set_taskexit side                    find_css_set side

                                        |
atomic_dec_and_test(&kref->refcount)    |
    /* kref->refcount = 0 */            |
....................................................................
                                        |  read_lock(&css_set_lock)
                                        |  find_existing_css_set
                                        |  get_css_set
                                        |  read_unlock(&css_set_lock);
....................................................................
__release_css_set                       |
....................................................................
                                        | /* use a released css_set */
                                        |


[put_css_set is the same. But in the current code, all put_css_set are
put into cgroup mutex critical region as the same as find_css_set.]


Signed-off-by: Lai Jiangshan <laijs@...fujitsu.com>
---
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 13932ab..003912e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -241,7 +241,6 @@ static void unlink_css_set(struct css_set *cg)
 	struct cg_cgroup_link *link;
 	struct cg_cgroup_link *saved_link;
 
-	write_lock(&css_set_lock);
 	hlist_del(&cg->hlist);
 	css_set_count--;
 
@@ -251,8 +250,6 @@ static void unlink_css_set(struct css_set *cg)
 		list_del(&link->cgrp_link_list);
 		kfree(link);
 	}
-
-	write_unlock(&css_set_lock);
 }
 
 static void __release_css_set(struct kref *k, int taskexit)
@@ -260,7 +257,13 @@ static void __release_css_set(struct kref *k, int taskexit)
 	int i;
 	struct css_set *cg = container_of(k, struct css_set, ref);
 
+	write_lock(&css_set_lock);
+	if (atomic_read(&k->refcount) > 0) {
+		write_unlock(&css_set_lock);
+		return;
+	}
 	unlink_css_set(cg);
+	write_unlock(&css_set_lock);
 
 	rcu_read_lock();
 	for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
@@ -410,6 +413,20 @@ static struct css_set *find_css_set(
 	 * the desired set */
 	read_lock(&css_set_lock);
 	res = find_existing_css_set(oldcg, cgrp, template);
+	/*
+	 * put_css_set[_taskexit]() may race with find_css_set(), in that
+	 * find_css_set() got the css_set after put_css_set() had released it.
+	 *
+	 * We should put the whole put_css_set[_taskexit]() into css_set_lock's
+	 * write_lock critical setion to avoid this race. But it will increase
+	 * overhead for do_exit().
+	 *
+	 * So we do not avoid this race but put it under control:
+	 * __release_css_set() will re-check the refcount
+	 * with css_set_lock held.
+	 *
+	 * This race may trigger the warnning in kref_get().
+	 */
 	if (res)
 		get_css_set(res);
 	read_unlock(&css_set_lock);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ