Date:   Sun, 12 Nov 2017 20:07:16 -0800
From:   Shaohua Li <shli@...nel.org>
To:     Tejun Heo <tj@...nel.org>
Cc:     axboe@...nel.dk, linux-kernel@...r.kernel.org, kernel-team@...com,
        lizefan@...wei.com, hannes@...xchg.org, cgroups@...r.kernel.org,
        guro@...com
Subject: Re: [PATCH 7/7] blk-throtl: don't throttle the same IO multiple times

On Sun, Nov 12, 2017 at 02:26:13PM -0800, Tejun Heo wrote:
> BIO_THROTTLED is used to mark already throttled bios so that a bio
> doesn't get throttled multiple times.  The flag gets set when the bio
> starts getting dispatched from blk-throtl and cleared when it leaves
> blk-throtl.
> 
> Unfortunately, this doesn't work when the request_queue decides to
> split or requeue the bio and ends up throttling the same IO multiple
> times.  This condition is easily triggered and often results in the
> enforced bandwidth limit being several times lower than the configured
> one.
> 
> Fix it by always setting BIO_THROTTLED for bios recursing to the same
> request_queue and clearing it only when a bio leaves the current level.
> 
> Signed-off-by: Tejun Heo <tj@...nel.org>
> ---
>  block/blk-core.c           | 10 +++++++---
>  block/blk-throttle.c       |  8 --------
>  include/linux/blk-cgroup.h | 20 ++++++++++++++++++++
>  3 files changed, 27 insertions(+), 11 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index ad23b96..f0e3157 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2216,11 +2216,15 @@ blk_qc_t generic_make_request(struct bio *bio)
>  			 */
>  			bio_list_init(&lower);
>  			bio_list_init(&same);
> -			while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
> -				if (q == bio->bi_disk->queue)
> +			while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL) {
> +				if (q == bio->bi_disk->queue) {
> +					blkcg_bio_repeat_q_level(bio);
>  					bio_list_add(&same, bio);
> -				else
> +				} else {
> +					blkcg_bio_leave_q_level(bio);
>  					bio_list_add(&lower, bio);
> +				}
> +			}

Hi Tejun,

Thanks for looking into this while I was away. I don't understand how this
works, though. Assume a bio will be split into two smaller bios. In
generic_make_request(), we charge the whole bio. 'q->make_request_fn' will
dispatch the first small bio and call generic_make_request() for the second
small bio. generic_make_request() then charges the second small bio, and we
add it to current->bio_list[0] (please check the order). In the code above
that the patch changes, we pop the second small bio and set BIO_THROTTLED for
it. But that is already too late, because generic_make_request() has already
charged the second small bio.
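
To make the ordering concrete, here is a minimal userspace sketch of that
sequence. The struct, the charge counter, and the pending array are
simplified stand-ins for struct bio, the blk_throtl_bio() accounting, and
current->bio_list[0], not the real kernel API; it only illustrates that the
split tail is charged by the recursive generic_make_request() before the
dispatch loop gets a chance to set BIO_THROTTLED on it:

#include <stdbool.h>
#include <stdio.h>

struct bio { int sectors; bool throttled; };

static int charged;              /* total sectors charged by throttling */
static struct bio *pending[8];   /* stand-in for current->bio_list[0] */
static int npending;

/* Models the throttle charge taken at generic_make_request() entry. */
static void submit(struct bio *b)
{
	if (!b->throttled)
		charged += b->sectors;
	pending[npending++] = b;
}

int main(void)
{
	struct bio whole = { .sectors = 8 };
	struct bio tail  = { .sectors = 4 };

	submit(&whole);   /* the whole 8-sector IO is charged */

	/*
	 * q->make_request_fn splits @whole, dispatches the first half and
	 * calls generic_make_request() for the tail, which charges the
	 * tail before the dispatch loop ever sees it.
	 */
	submit(&tail);    /* +4 sectors: the tail is charged a second time */

	/* Only now does the pop loop run and mark the queued bios. */
	for (int i = 0; i < npending; i++)
		pending[i]->throttled = true;  /* too late for the charge */

	printf("charged %d sectors for an 8-sector IO\n", charged); /* 12 */
	return 0;
}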

Did you look at my original patch
(https://marc.info/?l=linux-block&m=150791825327628&w=2)? Is there anything
wrong with it?

Thanks,
Shaohua

>  			/* now assemble so we handle the lowest level first */
>  			bio_list_merge(&bio_list_on_stack[0], &lower);
>  			bio_list_merge(&bio_list_on_stack[0], &same);
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 1e6916b..76579b2 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -2223,14 +2223,6 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
>  out_unlock:
>  	spin_unlock_irq(q->queue_lock);
>  out:
> -	/*
> -	 * As multiple blk-throtls may stack in the same issue path, we
> -	 * don't want bios to leave with the flag set.  Clear the flag if
> -	 * being issued.
> -	 */
> -	if (!throttled)
> -		bio_clear_flag(bio, BIO_THROTTLED);
> -
>  #ifdef CONFIG_BLK_DEV_THROTTLING_LOW
>  	if (throttled || !td->track_bio_latency)
>  		bio->bi_issue_stat.stat |= SKIP_LATENCY;
> diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
> index f2f9691..bed0416 100644
> --- a/include/linux/blk-cgroup.h
> +++ b/include/linux/blk-cgroup.h
> @@ -675,9 +675,29 @@ static inline void blkg_rwstat_add_aux(struct blkg_rwstat *to,
>  #ifdef CONFIG_BLK_DEV_THROTTLING
>  extern bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
>  			   struct bio *bio);
> +
> +static inline void blkcg_bio_repeat_q_level(struct bio *bio)
> +{
> +	/*
> +	 * @bio is queued while processing a previous bio which was already
> +	 * throttled.  Don't throttle it again.
> +	 */
> +	bio_set_flag(bio, BIO_THROTTLED);
> +}
> +
> +static inline void blkcg_bio_leave_q_level(struct bio *bio)
> +{
> +	/*
> +	 * @bio may get throttled at multiple q levels, clear THROTTLED
> +	 * when leaving the current one.
> +	 */
> +	bio_clear_flag(bio, BIO_THROTTLED);
> +}
>  #else
>  static inline bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
>  				  struct bio *bio) { return false; }
> +static inline void blkcg_bio_repeat_q_level(struct bio *bio) { }
> +static inline void blkcg_bio_leave_q_level(struct bio *bio) { }
>  #endif
>  
>  static inline struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
> -- 
> 2.9.5
> 
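
For reference, the flag lifecycle the patch seems to aim for can be modeled
in userspace like this (a standalone sketch with made-up stand-ins, not the
kernel API): a bio resubmitted to the same request_queue keeps BIO_THROTTLED
so the throttle charge is skipped at that level, and the flag is cleared once
the bio moves down to a lower-level queue so a stacked throttler can still
charge it there:

#include <stdbool.h>
#include <stdio.h>

struct bio { bool throttled; };

static int charges[2];           /* per-level throttle charges */

/* Stand-in for the blk_throtl_bio() charge at one queue level. */
static void throtl_bio(int level, struct bio *b)
{
	if (b->throttled)
		return;          /* already throttled at this level */
	charges[level]++;
}

int main(void)
{
	struct bio b = { .throttled = false };

	throtl_bio(0, &b);       /* charged once at the top queue */

	b.throttled = true;      /* blkcg_bio_repeat_q_level(): resubmitted
				    to the same queue after a split */
	throtl_bio(0, &b);       /* skipped: no double charge */

	b.throttled = false;     /* blkcg_bio_leave_q_level(): handed down
				    to the lower-level queue */
	throtl_bio(1, &b);       /* charged once at the lower level */

	printf("level0=%d level1=%d\n", charges[0], charges[1]); /* 1 1 */
	return 0;
}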
