lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20130507165018.GA5531@mtj.dyndns.org>
Date:	Tue, 7 May 2013 09:50:18 -0700
From:	Tejun Heo <tj@...nel.org>
To:	axboe@...nel.dk
Cc:	linux-kernel@...r.kernel.org, lizefan@...wei.com,
	containers@...ts.linux-foundation.org, cgroups@...r.kernel.org,
	vgoyal@...hat.com
Subject: [PATCH v2 33/33] blk-throttle: implement proper hierarchy support

With the recent updates, blk-throttle is finally ready for proper
hierarchy support.  Dispatching now honors service_queue->parent_sq
and propagates correctly.  The only thing missing is setting
->parent_sq correctly so that throtl_grp hierarchy matches the cgroup
hierarchy.

This patch updates throtl_pd_init() such that service_queues form the
same hierarchy as the cgroup hierarchy if sane_behavior is enabled.
As this concludes proper hierarchy support for blkcg, the shameful
.broken_hierarchy tag is removed from blkio_subsys.

v2: Updated blkio-controller.txt as suggested by Vivek.

Signed-off-by: Tejun Heo <tj@...nel.org>
Acked-by: Vivek Goyal <vgoyal@...hat.com>
Cc: Li Zefan <lizefan@...wei.com>
---
Updated the doc as suggested.  Will push out once -rc1 drops.

Thanks!

 Documentation/cgroups/blkio-controller.txt |   29 +++++++++++++++--------------
 block/blk-cgroup.c                         |    8 --------
 block/blk-throttle.c                       |   22 +++++++++++++++++++++-
 include/linux/cgroup.h                     |    2 ++
 4 files changed, 38 insertions(+), 23 deletions(-)

--- a/Documentation/cgroups/blkio-controller.txt
+++ b/Documentation/cgroups/blkio-controller.txt
@@ -94,11 +94,13 @@ Throttling/Upper Limit policy
 
 Hierarchical Cgroups
 ====================
-- Currently only CFQ supports hierarchical groups. For throttling,
-  cgroup interface does allow creation of hierarchical cgroups and
-  internally it treats them as flat hierarchy.
 
-  If somebody created a hierarchy like as follows.
+Both CFQ and throttling implement hierarchy support; however,
+throttling's hierarchy support is enabled iff "sane_behavior" is
+enabled from cgroup side, which currently is a development option and
+not publicly available.
+
+If somebody created a hierarchy like as follows.
 
 			root
 			/  \
@@ -106,21 +108,20 @@ Hierarchical Cgroups
 			|
 		     test3
 
-  CFQ will handle the hierarchy correctly but and throttling will
-  practically treat all groups at same level. For details on CFQ
-  hierarchy support, refer to Documentation/block/cfq-iosched.txt.
-  Throttling will treat the hierarchy as if it looks like the
-  following.
+CFQ by default and throttling with "sane_behavior" will handle the
+hierarchy correctly.  For details on CFQ hierarchy support, refer to
+Documentation/block/cfq-iosched.txt.  For throttling, all limits apply
+to the whole subtree while all statistics are local to the IOs
+directly generated by tasks in that cgroup.
+
+Throttling without "sane_behavior" enabled from cgroup side will
+practically treat all groups at same level as if it looks like the
+following.
 
 				pivot
 			     /  /   \  \
 			root  test1 test2  test3
 
-  Nesting cgroups, while allowed, isn't officially supported and blkio
-  genereates warning when cgroups nest. Once throttling implements
-  hierarchy support, hierarchy will be supported and the warning will
-  be removed.
-
 Various user visible config options
 ===================================
 CONFIG_BLK_CGROUP
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -911,14 +911,6 @@ struct cgroup_subsys blkio_subsys = {
 	.subsys_id = blkio_subsys_id,
 	.base_cftypes = blkcg_files,
 	.module = THIS_MODULE,
-
-	/*
-	 * blkio subsystem is utterly broken in terms of hierarchy support.
-	 * It treats all cgroups equally regardless of where they're
-	 * located in the hierarchy - all cgroups are treated as if they're
-	 * right below the root.  Fix it and remove the following.
-	 */
-	.broken_hierarchy = true,
 };
 EXPORT_SYMBOL_GPL(blkio_subsys);
 
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -397,10 +397,30 @@ static void throtl_pd_init(struct blkcg_
 {
 	struct throtl_grp *tg = blkg_to_tg(blkg);
 	struct throtl_data *td = blkg->q->td;
+	struct throtl_service_queue *parent_sq;
 	unsigned long flags;
 	int rw;
 
-	throtl_service_queue_init(&tg->service_queue, &td->service_queue);
+	/*
+	 * If sane_hierarchy is enabled, we switch to properly hierarchical
+	 * behavior where limits on a given throtl_grp are applied to the
+	 * whole subtree rather than just the group itself.  e.g. If 16M
+	 * read_bps limit is set on the root group, the whole system can't
+	 * exceed 16M for the device.
+	 *
+	 * If sane_hierarchy is not enabled, the broken flat hierarchy
+	 * behavior is retained where all throtl_grps are treated as if
+	 * they're all separate root groups right below throtl_data.
+	 * Limits of a group don't interact with limits of other groups
+	 * regardless of the position of the group in the hierarchy.
+	 */
+	parent_sq = &td->service_queue;
+
+	if (cgroup_sane_behavior(blkg->blkcg->css.cgroup) && blkg->parent)
+		parent_sq = &blkg_to_tg(blkg->parent)->service_queue;
+
+	throtl_service_queue_init(&tg->service_queue, parent_sq);
+
 	for (rw = READ; rw <= WRITE; rw++) {
 		throtl_qnode_init(&tg->qnode_on_self[rw], tg);
 		throtl_qnode_init(&tg->qnode_on_parent[rw], tg);
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -272,6 +272,8 @@ enum {
 	 * - memcg: use_hierarchy is on by default and the cgroup file for
 	 *   the flag is not created.
 	 *
+	 * - blkcg: blk-throttle becomes properly hierarchical.
+	 *
 	 * The followings are planned changes.
 	 *
 	 * - release_agent will be disallowed once replacement notification
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ