Message-ID: <93e96f14-dfe3-6390-5a91-f28e1cdb1783@huaweicloud.com>
Date: Thu, 28 Aug 2025 19:24:48 +0800
From: Li Nan <linan666@...weicloud.com>
To: Yu Kuai <yukuai1@...weicloud.com>, hch@...radead.org, corbet@....net,
agk@...hat.com, snitzer@...nel.org, mpatocka@...hat.com, song@...nel.org,
xni@...hat.com, hare@...e.de, colyli@...nel.org
Cc: linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
dm-devel@...ts.linux.dev, linux-raid@...r.kernel.org, yukuai3@...wei.com,
yi.zhang@...wei.com, yangerkun@...wei.com, johnny.chenyi@...wei.com
Subject: Re: [PATCH v6 md-6.18 11/11] md/md-llbitmap: introduce new lockless
bitmap
On 2025/8/26 16:52, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@...wei.com>
>
> Redundant data is used to enhance data fault tolerance, and the storage
> method for redundant data varies depending on the RAID level. It's
> important to maintain the consistency of the redundant data.
>
> The bitmap is used to record which data blocks have been synchronized and
> which ones need to be resynchronized or recovered. Each bit in the bitmap
> represents a segment of data in the array. When a bit is set, it indicates
> that the multiple redundant copies of that data segment may not be
> consistent. Data synchronization can be performed based on the bitmap after
> a power failure or after re-adding a disk. Without a bitmap, a full disk
> synchronization is required.
>
> Key Features:
>
> - IO fastpath is lockless; if the user issues lots of write IO to the same
> bitmap bit in a short time, only the first write has the additional overhead
> of updating the bitmap bit, and there is no additional overhead for the
> following writes;
> - support resyncing or recovering only written data, meaning that when
> creating a new array or replacing a disk with a new one, there is no need
> to do a full disk resync/recovery;
>
> Key Concept:
>
> - State Machine:
>
> Each bit is one byte and contains 6 different states, see llbitmap_state.
> There are 8 different actions in total, see llbitmap_action, that can
> change the state:
>
> llbitmap state machine: transitions between states
>
> |           | Startwrite | Startsync | Endsync | Abortsync |
> | --------- | ---------- | --------- | ------- | --------- |
> | Unwritten | Dirty      | x         | x       | x         |
> | Clean     | Dirty      | x         | x       | x         |
> | Dirty     | x          | x         | x       | x         |
> | NeedSync  | x          | Syncing   | x       | x         |
> | Syncing   | x          | Syncing   | Dirty   | NeedSync  |
>
> |           | Reload   | Daemon | Discard   | Stale     |
> | --------- | -------- | ------ | --------- | --------- |
> | Unwritten | x        | x      | x         | x         |
> | Clean     | x        | x      | Unwritten | NeedSync  |
> | Dirty     | NeedSync | Clean  | Unwritten | NeedSync  |
> | NeedSync  | x        | x      | Unwritten | x         |
> | Syncing   | NeedSync | x      | Unwritten | NeedSync  |
>
> Typical scenarios:
>
> 1) Create new array
> All bits are set to Unwritten by default; if --assume-clean is set,
> all bits are set to Clean instead.
>
> 2) write data; raid1/raid10 have full copies of the data, while raid456
> doesn't and relies on xor data
>
> 2.1) write new data to raid1/raid10:
> Unwritten --StartWrite--> Dirty
>
> 2.2) write new data to raid456:
> Unwritten --StartWrite--> NeedSync
>
> Because the initial recovery for raid456 is skipped, the xor data is not
> built yet, so the bit must be set to NeedSync first; after the lazy initial
> recovery is finished, the bit will finally be set to Dirty (see 5.1 and 5.4);
>
> 2.3) cover write
> Clean --StartWrite--> Dirty
>
> 3) daemon, if the array is not degraded:
> Dirty --Daemon--> Clean
>
> For a degraded array, the Dirty bit will never be cleared, preventing a
> full disk recovery when re-adding a removed disk.
>
> 4) discard
> {Clean, Dirty, NeedSync, Syncing} --Discard--> Unwritten
>
> 5) resync and recover
>
> 5.1) common process
> NeedSync --Startsync--> Syncing --Endsync--> Dirty --Daemon--> Clean
There are some issues with the Dirty state:
1. The Dirty bit will not be synced when a disk is re-added.
2. It remains Dirty even after a full recovery -- it should be Clean.
--
Thanks,
Nan