linux-kernel - Re: [RFC PATCH] blk-mq: fix potential I/O hang caused by batch wakeup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <24d4d60f-05f3-472b-8dfc-4edcb5f7883c@acm.org>
Date: Mon, 20 May 2024 11:11:52 -0700
From: Bart Van Assche <bvanassche@....org>
To: Yang Yang <yang.yang@...o.com>, Jens Axboe <axboe@...nel.dk>,
 linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] blk-mq: fix potential I/O hang caused by batch wakeup

On 5/19/24 20:38, Yang Yang wrote:
> The depth is 62, and the wake_batch is 8. In the following situation,
> the task would hang forever.
> 
>    t1:                 t2:                          t3:
>    blk_mq_get_tag      .                            .
>    io_schedule         .                            .
>                        elevator_switch              .
>                        blk_mq_freeze_queue          .
>                        blk_freeze_queue_start       .
>                        blk_mq_freeze_queue_wait     .
>                                                     blk_mq_submit_bio
>                                                     __bio_queue_enter
> 
> Fix this issue by waking up all the waiters sleeping on tags after
> freezing the queue.

Shouldn't blk_mq_alloc_request() be mentioned in t1 since that is the function
that calls blk_queue_enter()?

> diff --git a/block/blk-core.c b/block/blk-core.c
> index a16b5abdbbf5..e1eacfad6e5b 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -298,8 +298,6 @@ void blk_queue_start_drain(struct request_queue *q)
>   	 * prevent I/O from crossing blk_queue_enter().
>   	 */
>   	blk_freeze_queue_start(q);
> -	if (queue_is_mq(q))
> -		blk_mq_wake_waiters(q);
>   	/* Make blk_queue_enter() reexamine the DYING flag. */
>   	wake_up_all(&q->mq_freeze_wq);
>   }

Why has blk_queue_start_drain() been modified? I don't see any reference
in the patch description to blk_queue_start_drain(). Am I perhaps missing
something?

> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4ecb9db62337..9eb3139e713a 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -125,8 +125,10 @@ void blk_freeze_queue_start(struct request_queue *q)
>   	if (++q->mq_freeze_depth == 1) {
>   		percpu_ref_kill(&q->q_usage_counter);
>   		mutex_unlock(&q->mq_freeze_lock);
> -		if (queue_is_mq(q))
> +		if (queue_is_mq(q)) {
> +			blk_mq_wake_waiters(q);
>   			blk_mq_run_hw_queues(q, false);
> +		}
>   	} else {
>   		mutex_unlock(&q->mq_freeze_lock);
>   	}

Why would the above change be necessary? If the blk_queue_enter() call
by blk_mq_alloc_request() succeeds and blk_mq_get_tag() calls
io_schedule(), io_schedule() will be woken up indirectly by the
blk_mq_run_hw_queues() call because that call will free one of the tags
that the io_schedule() call is waiting for.

Thanks,

Bart.