[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171207154513.4154-63-alexander.levin@verizon.com>
Date: Thu, 7 Dec 2017 15:45:45 +0000
From: alexander.levin@...izon.com
To: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Cc: Anand Jain <anand.jain@...cle.com>,
David Sterba <dsterba@...e.com>, alexander.levin@...izon.com
Subject: [PATCH AUTOSEL for 4.14 063/135] btrfs: fix false EIO for missing
device
From: Anand Jain <anand.jain@...cle.com>
[ Upstream commit 102ed2c5ff932439bbbe74c7bd63e6d5baa9f732 ]
When one of the device is missing, bbio_error() takes care of setting
the error status. And if its only IO that is pending in that stripe, it
fails to check the status of the other IO at %bbio_error before setting
the error %bi_status for the %orig_bio. Fix this by checking if
%bbio->error has exceeded the %bbio->max_errors.
Reproducer as below fdatasync error is seen intermittently.
mount -o degraded /dev/sdc /btrfs
dd status=none if=/dev/zero of=$(mktemp /btrfs/XXX) bs=4096 count=1 conv=fdatasync
dd: fdatasync failed for ‘/btrfs/LSe’: Input/output error
The reason for the intermittences of the problem is because
the following conditions have to be met, which depends on timing:
In btrfs_map_bio()
- the RAID1 the missing device has to be at %dev_nr = 1
In bbio_error()
. before bbio_error() is called the bio of the not-missing
device at %dev_nr = 0 must be completed so that the below
condition is true
if (atomic_dec_and_test(&bbio->stripes_pending)) {
Signed-off-by: Anand Jain <anand.jain@...cle.com>
Reviewed-by: Liu Bo <bo.li.liu@...cle.com>
Signed-off-by: David Sterba <dsterba@...e.com>
Signed-off-by: Sasha Levin <alexander.levin@...izon.com>
---
fs/btrfs/volumes.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b39737568c22..abc57f7b2716 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6144,7 +6144,10 @@ static void bbio_error(struct btrfs_bio *bbio, struct bio *bio, u64 logical)
btrfs_io_bio(bio)->mirror_num = bbio->mirror_num;
bio->bi_iter.bi_sector = logical >> 9;
- bio->bi_status = BLK_STS_IOERR;
+ if (atomic_read(&bbio->error) > bbio->max_errors)
+ bio->bi_status = BLK_STS_IOERR;
+ else
+ bio->bi_status = BLK_STS_OK;
btrfs_end_bbio(bbio, bio);
}
}
--
2.11.0
Powered by blists - more mailing lists