[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220307091637.820937128@linuxfoundation.org>
Date: Mon, 7 Mar 2022 10:19:04 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
stable@...r.kernel.org, Ye Bin <yebin10@...wei.com>,
Ming Lei <ming.lei@...hat.com>, Jens Axboe <axboe@...nel.dk>,
Sudip Mukherjee <sudipm.mukherjee@...il.com>
Subject: [PATCH 4.19 29/51] block: Fix fsync always failed if once failed
From: Ye Bin <yebin10@...wei.com>
commit 8a7518931baa8ea023700987f3db31cb0a80610b upstream.
We do test with inject error fault base on v4.19, after test some time we found
sync /dev/sda always failed.
[root@...alhost] sync /dev/sda
sync: error syncing '/dev/sda': Input/output error
scsi log as follows:
[19069.812296] sd 0:0:0:0: [sda] tag#64 Send: scmd 0x00000000d03a0b6b
[19069.812302] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[19069.812533] sd 0:0:0:0: [sda] tag#64 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[19069.812536] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[19069.812539] sd 0:0:0:0: [sda] tag#64 scsi host busy 1 failed 0
[19069.812542] sd 0:0:0:0: Notifying upper driver of completion (result 0)
[19069.812546] sd 0:0:0:0: [sda] tag#64 sd_done: completed 0 of 0 bytes
[19069.812549] sd 0:0:0:0: [sda] tag#64 0 sectors total, 0 bytes done.
[19069.812564] print_req_error: I/O error, dev sda, sector 0
ftrace log as follows:
rep-306069 [007] .... 19654.923315: block_bio_queue: 8,0 FWS 0 + 0 [rep]
rep-306069 [007] .... 19654.923333: block_getrq: 8,0 FWS 0 + 0 [rep]
kworker/7:1H-250 [007] .... 19654.923352: block_rq_issue: 8,0 FF 0 () 0 + 0 [kworker/7:1H]
<idle>-0 [007] ..s. 19654.923562: block_rq_complete: 8,0 FF () 18446744073709551615 + 0 [0]
<idle>-0 [007] d.s. 19654.923576: block_rq_complete: 8,0 WS () 0 + 0 [-5]
As 8d6996630c03 introduce 'fq->rq_status', this data only update when 'flush_rq'
reference count isn't zero. If flush request once failed and record error code
in 'fq->rq_status'. If there is no chance to update 'fq->rq_status',then do fsync
will always failed.
To address this issue reset 'fq->rq_status' after return error code to upper layer.
Fixes: 8d6996630c03("block: fix null pointer dereference in blk_mq_rq_timed_out()")
Signed-off-by: Ye Bin <yebin10@...wei.com>
Reviewed-by: Ming Lei <ming.lei@...hat.com>
Link: https://lore.kernel.org/r/20211129012659.1553733-1-yebin10@huawei.com
Signed-off-by: Jens Axboe <axboe@...nel.dk>
[sudip: adjust context]
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@...il.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
---
block/blk-flush.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -239,8 +239,10 @@ static void flush_end_io(struct request
return;
}
- if (fq->rq_status != BLK_STS_OK)
+ if (fq->rq_status != BLK_STS_OK) {
error = fq->rq_status;
+ fq->rq_status = BLK_STS_OK;
+ }
hctx = blk_mq_map_queue(q, flush_rq->mq_ctx->cpu);
if (!q->elevator) {
Powered by blists - more mailing lists