lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130506173306.GC11731@redhat.com>
Date:	Mon, 6 May 2013 13:33:06 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Jens Axboe <axboe@...nel.dk>, lkml <linux-kernel@...r.kernel.org>,
	Li Zefan <lizefan@...wei.com>,
	containers@...ts.linux-foundation.org,
	Cgroups <cgroups@...r.kernel.org>
Subject: Re: [PATCHSET] blk-throttle: implement proper hierarchy support

On Fri, May 03, 2013 at 04:54:55PM -0700, Tejun Heo wrote:
> Hey,
> 
> On Fri, May 03, 2013 at 05:05:13PM -0400, Vivek Goyal wrote:
> > Following inline patch implements transferring child's start time to 
> > parent, if parent slice had expired at the time of bio migration.
> > 
> > I does seem to help a lot on my machine. Can you please give it a try.
> 
> Cool, will give it a try.  Can you please make it a proper patch with
> SOB?  Please feel free to base it on top of the series.  It can
> probably go right below the final patch but the rebase there should be
> trivial.

Hi tejun,

Here is the patch to fix the issue. It is on top of your old series
(no throtl_node stuff). Do let me know if want me to refresh it
on top of throtl_node patch.

Thanks
Vivek


blk-throtl: Account for child group's start time in parent while bio climbs up

Now with hierarchy support, a bio climbs up the tree before actually
being dispatched. This makes sure bio is also subjected to parent's
throttling limits, if any.

It might happen that parent is idle and when bio is transferred to
parent, a new slice starts fresh. But that is not good as parents
wait time should have started when bio was queued in child group.

Given the fact that we have not written hierarchical algorithm 
in a way where child's and parents time slices are synchronized,
we transfer the child's start time to parent if parent was idling.
If parent was busy doing dispatch of other bios all this while, this
is not an issue.

Child's slice start time is passed to parent. Parent looks at its
last expired slice start time. If child's start time is after parents
old start time, that means parent had been idle and after parent
went idle, child had an IO queued. So use child's start time as
parent start time.

If parent's start time is after child's start time, that means,
when IO got queued in child group, parent was not idle. But later
it dispatched some IO, its slice got trimmed and then it went idle.
After a while child's request got shifted in parent group. In this
case use parent's old start time as new start time as that's the
duration of slice we did not use.

This logic is far from perfect as if there are multiple childs
then first child transferring the bio decides the start time while
a bio might have queued up even earlier in other child, which is
yet to be transferred up to parent. In that case we will lose
time and bandwidth in parent. This patch is just an approximation
to make situation somewhat better.
 
Signed-off-by: Vivek Goyal <vgoyal@...hat.com>
---
 block/blk-throttle.c |   33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

Index: linux-2.6/block/blk-throttle.c
===================================================================
--- linux-2.6.orig/block/blk-throttle.c	2013-05-06 12:59:20.585535099 -0400
+++ linux-2.6/block/blk-throttle.c	2013-05-06 13:03:37.857552427 -0400
@@ -547,6 +547,28 @@ static bool throtl_schedule_next_dispatc
 	return false;
 }
 
+static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
+		bool rw, unsigned long start)
+{
+	tg->bytes_disp[rw] = 0;
+	tg->io_disp[rw] = 0;
+
+	/*
+	 * Previous slice has expired. We must have trimmed it after last
+	 * bio dispatch. That means since start of last slice, we never used
+	 * that bandwidth. Do try to make use of that bandwidth while giving
+	 * credit.
+	 */
+	if (time_after_eq(start, tg->slice_start[rw]))
+		tg->slice_start[rw] = start;
+
+	tg->slice_end[rw] = jiffies + throtl_slice;
+	throtl_log(&tg->service_queue,
+		   "[%c] new slice with credit start=%lu end=%lu jiffies=%lu",
+		   rw == READ ? 'R' : 'W', tg->slice_start[rw],
+		   tg->slice_end[rw], jiffies);
+}
+
 static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw)
 {
 	tg->bytes_disp[rw] = 0;
@@ -888,6 +910,16 @@ static void tg_update_disptime(struct th
 	tg->flags &= ~THROTL_TG_WAS_EMPTY;
 }
 
+static void start_parent_slice_with_credit(struct throtl_grp *child_tg,
+					struct throtl_grp *parent_tg, bool rw)
+{
+	if (throtl_slice_used(parent_tg, rw)) {
+		throtl_start_new_slice_with_credit(parent_tg, rw,
+				child_tg->slice_start[rw]);
+	}
+
+}
+
 static void tg_dispatch_one_bio(struct throtl_grp *tg, bool rw)
 {
 	struct throtl_service_queue *sq = &tg->service_queue;
@@ -909,6 +941,7 @@ static void tg_dispatch_one_bio(struct t
 	 */
 	if (parent_tg) {
 		throtl_add_bio_tg(bio, parent_tg);
+		start_parent_slice_with_credit(tg, parent_tg, rw);
 	} else {
 		bio_list_add(&parent_sq->bio_lists[rw], bio);
 		BUG_ON(tg->td->nr_queued[rw] <= 0);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ