Message-ID: <10e164cc-149f-baf6-de52-0b7d3c9468f6@huaweicloud.com>
Date: Thu, 25 May 2023 22:00:23 +0800
From: Li Nan <linan666@...weicloud.com>
To: Yu Kuai <yukuai1@...weicloud.com>, linan666@...weicloud.com,
song@...nel.org, shli@...com, allenpeng@...ology.com,
alexwu@...ology.com, bingjingc@...ology.com, neilb@...e.de
Cc: linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org,
yi.zhang@...wei.com, houtao1@...wei.com, yangerkun@...wei.com,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH 2/3] md/raid10: fix incorrect done of recovery
On 2023/5/22 21:54, Yu Kuai wrote:
> Hi,
>
> On 2023/05/22 19:54, linan666@...weicloud.com wrote:
>> From: Li Nan <linan122@...wei.com>
>>
>> Recovery goes to giveup and increments chunks_skipped in
>> raid10_sync_request() if there are bad blocks, and the function returns
>> max_sector once chunks_skipped >= geo.raid_disks. At that point recovery
>> has failed and the data is inconsistent, but the user thinks recovery is
>> done, which is wrong.
>>
>> Fix it by setting the mirror's recovery_disabled; a spare device
>> shouldn't be added here.
>>
>> Signed-off-by: Li Nan <linan122@...wei.com>
>> ---
>> drivers/md/raid10.c | 16 +++++++++++++++-
>> 1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index e21502c03b45..70cc87c7ee57 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -3303,6 +3303,7 @@ static sector_t raid10_sync_request(struct mddev
>> *mddev, sector_t sector_nr,
>> int chunks_skipped = 0;
>> sector_t chunk_mask = conf->geo.chunk_mask;
>> int page_idx = 0;
>> + int error_disk = -1;
>> /*
>> * Allow skipping a full rebuild for incremental assembly
>> @@ -3386,7 +3387,18 @@ static sector_t raid10_sync_request(struct
>> mddev *mddev, sector_t sector_nr,
>> return reshape_request(mddev, sector_nr, skipped);
>> if (chunks_skipped >= conf->geo.raid_disks) {
>> - /* if there has been nothing to do on any drive,
>> + pr_err("md/raid10:%s: %s fail\n", mdname(mddev),
>> + test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ? "resync"
>> : "recovery");
>
> This line exceeds 80 columns, as do the following ones.
>> + if (error_disk >= 0 && !test_bit(MD_RECOVERY_SYNC,
>> &mddev->recovery)) {
>
> Resync has the same problem, right?
>
Yes, but I have no idea how to fix it. Neither calling md_error() on the
disk nor setting recovery_disabled is a good solution, so for now the patch
just prints an error message. Do you have any ideas?
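
To make the problem concrete, here is a small userspace model of the
behavior described in the commit message. It is only a sketch under
assumptions, not the kernel code: struct model_conf, recover_model(), the
bad_from parameter and the failed flag are all hypothetical names made up
for illustration, and the real raid10_sync_request() does far more.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long long sector_t;

/* Hypothetical, heavily reduced stand-in for the raid10 state. */
struct model_conf {
	int raid_disks;         /* plays the role of conf->geo.raid_disks */
	sector_t max_sector;    /* end of the region being recovered */
	sector_t chunk_sectors; /* sectors covered by one attempt */
	bool failed;            /* set when recovery is abandoned */
};

/*
 * Model of a recovery pass starting at sector_nr. Every chunk at or
 * beyond bad_from has no readable source (standing in for bad blocks).
 * Returns the sector at which the pass stops; a return value of
 * max_sector is what the caller interprets as "recovery finished".
 */
static sector_t recover_model(struct model_conf *conf, sector_t sector_nr,
			      sector_t bad_from)
{
	int chunks_skipped = 0;

	assert(conf->chunk_sectors > 0); /* otherwise the loop never ends */

	while (sector_nr < conf->max_sector) {
		if (chunks_skipped >= conf->raid_disks) {
			/*
			 * Give up. Without the failed flag the caller only
			 * sees max_sector and assumes success.
			 */
			conf->failed = true;
			return conf->max_sector;
		}
		if (sector_nr >= bad_from)
			chunks_skipped++;   /* "giveup" path: nothing readable */
		else
			chunks_skipped = 0; /* chunk recovered normally */
		sector_nr += conf->chunk_sectors;
	}
	return conf->max_sector;
}
```

With raid_disks = 4, max_sector = 1024, chunk_sectors = 64 and bad_from =
256, the model returns 1024 exactly as a clean run would. Only the extra
failed flag tells the two cases apart, which is the information the
caller is currently missing.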
--
Thanks,
Nan