lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160121203332.GD8379@redhat.com>
Date:	Thu, 21 Jan 2016 15:33:32 -0500
From:	Vivek Goyal <vgoyal@...hat.com>
To:	Shaohua Li <shli@...com>
Cc:	linux-kernel@...r.kernel.org, axboe@...nel.dk, tj@...nel.org,
	jmoyer@...hat.com, Kernel-team@...com
Subject: Re: [RFC 2/3] blk-throttling: weight based throttling

On Wed, Jan 20, 2016 at 09:49:18AM -0800, Shaohua Li wrote:
> We know total bandwidth of a disk and can calculate cgroup's bandwidth
> percentage against disk bandwidth according to its weight. We can easily
> calculate cgroup bandwidth.
> 
> Signed-off-by: Shaohua Li <shli@...com>
> ---
>  block/blk-throttle.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 134 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 2149a1d..b3f847d 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -12,6 +12,9 @@
>  #include <linux/blk-cgroup.h>
>  #include "blk.h"
>  
> +#define MAX_WEIGHT (1000)
> +#define WEIGHT_RATIO_SHIFT (12)
> +#define WEIGHT_RATIO (1 << WEIGHT_RATIO_SHIFT)
>  /* Max dispatch from a group in 1 round */
>  static int throtl_grp_quantum = 8;
>  
> @@ -74,6 +77,10 @@ struct throtl_service_queue {
>  	unsigned int		nr_pending;	/* # queued in the tree */
>  	unsigned long		first_pending_disptime;	/* disptime of the first tg */
>  	struct timer_list	pending_timer;	/* fires on first_pending_disptime */
> +
> +	unsigned int		weight;
> +	unsigned int		children_weight;
> +	unsigned int		ratio;

Will it be better to call it "share" instead of "ratio". It is basically
a measure of % disk share of the group and share seems more intuitive.


[..]
> +static void tg_update_bps(struct throtl_grp *tg)
> +{
> +	struct throtl_service_queue *sq, *parent_sq;
> +
> +	sq = &tg->service_queue;
> +	parent_sq = sq->parent_sq;
> +
> +	if (!tg->td->weight_based || !parent_sq)
> +		return;
> +	sq->ratio = max_t(unsigned int,
> +		parent_sq->ratio * sq->weight / parent_sq->children_weight,
> +		1);
> +

It might be good to decouple updation of "share/ratio" and updation of
bps. Change of share can happen any time either weight is changed or
an active group is queue/dequeued and we don't have to do it every time
a bio is submitted.

> +	tg->bps[READ] = max_t(uint64_t,
> +		(queue_bandwidth(tg->td, READ) * sq->ratio) >>
> +			WEIGHT_RATIO_SHIFT,
> +		1024);
> +	tg->bps[WRITE] = max_t(uint64_t,
> +		(queue_bandwidth(tg->td, WRITE) * sq->ratio) >>
> +			WEIGHT_RATIO_SHIFT,
> +		1024);
> +}
> +
> +static void tg_update_ratio(struct throtl_grp *tg)
> +{
> +	struct throtl_data *td = tg->td;
> +	struct cgroup_subsys_state *pos_css;
> +	struct blkcg_gq *blkg;
> +
> +	blkg_for_each_descendant_pre(blkg, pos_css, td->queue->root_blkg) {

Is it possible to traverse only the affected subtree instead of whole
tree of groups. Because if weight is updated on a group, then we just
need to traverse the subtree under that group's parent.


[..]
> @@ -1415,6 +1546,7 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
>  
>  	sq = &tg->service_queue;
>  
> +	tg_update_bps(tg);

Updating bps for every bio submitted sounds like a lot. We probably could
do it when first bio gets queued in the group and then refresh it at
some regular interval. Say when next set of dispatch happens from group
we could update bandwidth of group after dispatch.

Thanks
Vivek

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ