[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250512011927.2809400-13-yukuai1@huaweicloud.com>
Date: Mon, 12 May 2025 09:19:20 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: hch@....de,
xni@...hat.com,
colyli@...nel.org,
agk@...hat.com,
snitzer@...nel.org,
mpatocka@...hat.com,
song@...nel.org,
yukuai3@...wei.com
Cc: linux-kernel@...r.kernel.org,
dm-devel@...ts.linux.dev,
linux-raid@...r.kernel.org,
yukuai1@...weicloud.com,
yi.zhang@...wei.com,
yangerkun@...wei.com,
johnny.chenyi@...wei.com
Subject: [PATCH RFC md-6.16 v3 12/19] md/md-llbitmap: implement bit state machine
From: Yu Kuai <yukuai3@...wei.com>
Each bit is one byte, contain 6 difference state, see llbitmap_state. And
there are total 8 differenct actions, see llbitmap_action, can change
state:
llbitmap state machine: transitions between states
| | Startwrite | Startsync | Endsync | Abortsync| Reload | Daemon | Discard | Stale |
| --------- | ---------- | --------- | ------- | ------- | -------- | ------ | --------- | --------- |
| Unwritten | Dirty | x | x | x | x | x | x | x |
| Clean | Dirty | x | x | x | x | x | Unwritten | NeedSync |
| Dirty | x | x | x | x | NeedSync | Clean | Unwritten | NeedSync |
| NeedSync | x | Syncing | x | x | x | x | Unwritten | x |
| Syncing | x | Syncing | Dirty | NeedSync | NeedSync | x | Unwritten | NeedSync |
Typical scenarios:
1) Create new array
All bits will be set to Unwritten by default, if --assume-clean is set,
All bits will be set to Clean instead.
2) write data, raid1/raid10 have full copy of data, while raid456 donen't and
rely on xor data
2.1) write new data to raid1/raid10:
Unwritten --StartWrite--> Dirty
2.2) write new data to raid456:
Unwritten --StartWrite--> NeedSync
Because the initial recover for raid456 is skipped, the xor data is not build
yet, the bit must set to NeedSync first and after lazy initial recover is
finished, the bit will finially set to Dirty(see 4.1 and 4.4);
2.3) cover write
Clean --StartWrite--> Dirty
3) daemon, if the array is not degraded:
Dirty --Daemon--> Clean
For degraded array, the Dirty bit will never be cleared, prevent full disk
recovery while readding a removed disk.
4) discard
{Clean, Dirty, NeedSync, Syncing} --Discard--> Unwritten
4) resync and recover
4.1) common process
NeedSync --Startsync--> Syncing --Endsync--> Dirty --Daemon--> Clean
4.2) resync after power failure
Dirty --Reload--> NeedSync
4.3) recover while replacing with a new disk
By default, the old bitmap framework will recover all data, and llbitmap
implement this by a new helper llbitmap_skip_sync_blocks:
skip recover for bits other than dirty or clean;
4.4) lazy initial recover for raid5:
By default, the old bitmap framework will only allow new recover when there
are spares(new disk), a new recovery flag MD_RECOVERY_LAZY_RECOVER is add
to perform raid456 lazy recover for set bits(from 2.2).
Signed-off-by: Yu Kuai <yukuai3@...wei.com>
---
drivers/md/md-llbitmap.c | 100 +++++++++++++++++++++++++++++++++++++++
1 file changed, 100 insertions(+)
diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c
index b27d10661387..315a4eb7627c 100644
--- a/drivers/md/md-llbitmap.c
+++ b/drivers/md/md-llbitmap.c
@@ -462,3 +462,103 @@ static int llbitmap_cache_pages(struct llbitmap *llbitmap)
return 0;
}
+
+static void llbitmap_init_state(struct llbitmap *llbitmap)
+{
+ enum llbitmap_state state = BitUnwritten;
+ unsigned long i;
+
+ if (test_and_clear_bit(BITMAP_CLEAN, &llbitmap->flags))
+ state = BitClean;
+
+ for (i = 0; i < llbitmap->chunks; i++) {
+ int ret = llbitmap_write(llbitmap, state, i);
+
+ if (ret < 0) {
+ set_bit(BITMAP_WRITE_ERROR, &llbitmap->flags);
+ break;
+ }
+ }
+}
+
+/* The return value is only used from resync, where @start == @end. */
+static enum llbitmap_state llbitmap_state_machine(struct llbitmap *llbitmap,
+ unsigned long start,
+ unsigned long end,
+ enum llbitmap_action action)
+{
+ struct mddev *mddev = llbitmap->mddev;
+ enum llbitmap_state state = BitNone;
+ bool need_resync = false;
+ bool need_recovery = false;
+
+ if (test_bit(BITMAP_WRITE_ERROR, &llbitmap->flags))
+ return BitNone;
+
+ if (action == BitmapActionInit) {
+ llbitmap_init_state(llbitmap);
+ return BitNone;
+ }
+
+ while (start <= end) {
+ ssize_t ret;
+ enum llbitmap_state c;
+
+ ret = llbitmap_read(llbitmap, &c, start);
+ if (ret < 0) {
+ set_bit(BITMAP_WRITE_ERROR, &llbitmap->flags);
+ return BitNone;
+ }
+
+ if (c < 0 || c >= nr_llbitmap_state) {
+ pr_err("%s: invalid bit %lu state %d action %d, forcing resync\n",
+ __func__, start, c, action);
+ state = BitNeedSync;
+ goto write_bitmap;
+ }
+
+ if (c == BitNeedSync)
+ need_resync = true;
+
+ state = state_machine[c][action];
+ if (state == BitNone) {
+ start++;
+ continue;
+ }
+
+write_bitmap:
+ /* Delay raid456 initial recovery to first write. */
+ if (c == BitUnwritten && state == BitDirty &&
+ action == BitmapActionStartwrite && is_raid456(mddev)) {
+ state = BitNeedSync;
+ need_recovery = true;
+ }
+
+ ret = llbitmap_write(llbitmap, state, start);
+ if (ret < 0) {
+ set_bit(BITMAP_WRITE_ERROR, &llbitmap->flags);
+ return BitNone;
+ }
+
+ if (state == BitNeedSync)
+ need_resync = true;
+ else if (state == BitDirty &&
+ !timer_pending(&llbitmap->pending_timer))
+ mod_timer(&llbitmap->pending_timer,
+ jiffies + mddev->bitmap_info.daemon_sleep * HZ);
+
+ start++;
+ }
+
+ if (need_recovery) {
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+ set_bit(MD_RECOVERY_LAZY_RECOVER, &mddev->recovery);
+ md_wakeup_thread(mddev->thread);
+ } else if (need_resync) {
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+ set_bit(MD_RECOVERY_SYNC, &mddev->recovery);
+ md_wakeup_thread(mddev->thread);
+ }
+
+ return state;
+}
--
2.39.2
Powered by blists - more mailing lists