Date:	Mon, 08 Dec 2014 18:59:08 +0100
From:	Bart Van Assche <bvanassche@....org>
To:	Jens Axboe <axboe@...nel.dk>
CC:	Christoph Hellwig <hch@....de>, Robert Elliott <Elliott@...com>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] blk-mq: Avoid that I/O hangs in bt_get()

On 12/08/14 17:49, Jens Axboe wrote:
> On 12/08/2014 07:55 AM, Bart Van Assche wrote:
>> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
>> index 67ab88b..e88af88 100644
>> --- a/block/blk-mq-tag.c
>> +++ b/block/blk-mq-tag.c
>> @@ -256,6 +256,8 @@ static int bt_get(struct blk_mq_alloc_data *data,
>>               break;
>>           }
>>
>> +        blk_mq_run_hw_queue(hctx, false);
>> +
>>           blk_mq_put_ctx(data->ctx);
>>
>>           io_schedule();
> 
> Ah yes, that could be an issue for some cases, we do need to run the 
> queue there. For a tag map shared across hardware queues, we might need 
> to run more than just the current queue, however. For now we can safely 
> assume that we allocate fairly, so it should not be an issue.
> 
> It might be worth experimenting with doing a __bt_get() after the queue 
> run before going to sleep, in case the queue run found completions as well.

Unless anyone objects I will start testing the following patch:

[PATCH] blk-mq: Fix bt_get() hang

Avoid that bt_get() hangs if there are fewer hardware queues than CPU
threads, by running the hardware queue and retrying the tag allocation
before going to sleep. The symptoms of the hang were as follows:
* All tags allocated for a particular hardware queue.
* (nr_tags) pending commands for that hardware queue.
* No pending commands for the software queues associated with that
  hardware queue.

The call stack that corresponds to the hang is as follows:

io_schedule+0x9c/0x130
bt_get+0xef/0x180
blk_mq_get_tag+0x9f/0xd0
__blk_mq_alloc_request+0x16/0x1f0
blk_mq_map_request+0x123/0x130
blk_mq_make_request+0x69/0x280
generic_make_request+0xc0/0x110
submit_bio+0x64/0x130
do_blockdev_direct_IO+0x1dc8/0x2da0
__blockdev_direct_IO+0x47/0x50
blkdev_direct_IO+0x49/0x50
generic_file_read_iter+0x546/0x610
blkdev_read_iter+0x32/0x40
aio_run_iocb+0x1f8/0x400
do_io_submit+0x121/0x490
SyS_io_submit+0xb/0x10
system_call_fastpath+0x12/0x17
---
 block/blk-mq-tag.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index c22491e..14ab120 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -256,6 +256,12 @@ static int bt_get(struct blk_mq_alloc_data *data,
 		if (tag != -1)
 			break;
 
+		blk_mq_run_hw_queue(hctx, false);
+
+		tag = __bt_get(hctx, bt, last_tag);
+		if (tag != -1)
+			break;
+
 		blk_mq_put_ctx(data->ctx);
 
 		io_schedule();
-- 
2.1.2
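
For readers who want to reason about the change outside the kernel, the
following is a minimal user-space sketch of the idea in the patch above: when
no tag is free, run the hardware queue and retry the allocation once before
sleeping. Everything named sim_* is hypothetical and not part of blk-mq. The
sketch also assumes that running the queue completes the queued requests
synchronously, which is a simplification; in the kernel, completions arrive
asynchronously.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_NR_TAGS 4

/* Hypothetical stand-in for one hardware queue and its tag map. */
struct sim_hctx {
	bool tag_busy[SIM_NR_TAGS];	/* true: tag is owned by a request */
	int queued[SIM_NR_TAGS];	/* tags of requests awaiting dispatch */
	size_t nr_queued;
};

/* Plays the role of __bt_get(): non-blocking, returns -1 if no tag is free. */
static int sim_try_get_tag(struct sim_hctx *h)
{
	for (int i = 0; i < SIM_NR_TAGS; i++) {
		if (!h->tag_busy[i]) {
			h->tag_busy[i] = true;
			return i;
		}
	}
	return -1;
}

/*
 * Plays the role of blk_mq_run_hw_queue(): dispatch everything that is
 * queued. Assumption: dispatched requests complete at once and free their
 * tags.
 */
static void sim_run_hw_queue(struct sim_hctx *h)
{
	for (size_t i = 0; i < h->nr_queued; i++)
		h->tag_busy[h->queued[i]] = false;
	h->nr_queued = 0;
}

/* Queue a request that owns 'tag' without dispatching it. */
static void sim_queue_request(struct sim_hctx *h, int tag)
{
	assert(tag >= 0 && h->nr_queued < SIM_NR_TAGS);
	h->queued[h->nr_queued++] = tag;
}

/*
 * Returns a tag, or -1 at the point where the real code would call
 * io_schedule(). With run_queue_before_sleep == false this models the
 * behavior before the patch.
 */
int sim_get_tag(struct sim_hctx *h, bool run_queue_before_sleep)
{
	int tag = sim_try_get_tag(h);

	if (tag != -1)
		return tag;

	if (run_queue_before_sleep) {
		sim_run_hw_queue(h);
		tag = sim_try_get_tag(h);
		if (tag != -1)
			return tag;
	}

	return -1;	/* would sleep here */
}

/* Reproduce the reported state: all tags allocated, all requests queued. */
void sim_exhaust(struct sim_hctx *h)
{
	*h = (struct sim_hctx){ 0 };
	for (int i = 0; i < SIM_NR_TAGS; i++)
		sim_queue_request(h, sim_try_get_tag(h));
}
```

In this model, calling sim_get_tag(&h, false) after sim_exhaust(&h) reaches
the sleep point with nothing left to wake the caller, while
sim_get_tag(&h, true) frees the queued tags and succeeds. It does not model a
tag map shared across hardware queues, where more than the current queue
might need to be run.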

