[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7fa875ee-ceed-e2dd-20fc-976e043e08ab@huaweicloud.com>
Date: Fri, 25 Oct 2024 15:17:06 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Mariusz Tkaczyk <mariusz.tkaczyk@...ux.intel.com>,
Yu Kuai <yukuai1@...weicloud.com>
Cc: song@...nel.org, linux-raid@...r.kernel.org,
linux-kernel@...r.kernel.org, yi.zhang@...wei.com, yangerkun@...wei.com,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH RFC 4/4] md/md-bitmap: support to build md-bitmap as
kernel module
Hi,
在 2024/10/25 15:02, Mariusz Tkaczyk 写道:
> On Thu, 24 Oct 2024 21:13:25 +0800
> Yu Kuai <yukuai1@...weicloud.com> wrote:
>
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> Now that all implementations are internal, it's sensible to build it as
>> kernel module now.
>>
>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
>> ---
>> drivers/md/Kconfig | 15 +++++++
>> drivers/md/Makefile | 4 +-
>> drivers/md/md-bitmap.c | 28 +++++++++++-
>> drivers/md/md-bitmap.h | 8 +++-
>> drivers/md/md.c | 97 +++++++++++++++++++++++++++++++++++++-----
>> drivers/md/md.h | 1 -
>> 6 files changed, 135 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
>> index 1e9db8e4acdf..452d7292b617 100644
>> --- a/drivers/md/Kconfig
>> +++ b/drivers/md/Kconfig
>> @@ -37,6 +37,21 @@ config BLK_DEV_MD
>>
>> If unsure, say N.
>>
>> +config MD_BITMAP
>> + tristate "RAID bitmap support"
>
> Maybe "MD RAID bitmap support"? From kernel config GUI description is
> presented, it seems better to highlight that it is MD internal.
>
>> + default y
>> + depends on BLK_DEV_MD
>> + help
>> + If you say Y here, support for the write intent bitmap will be
>> + enabled. The bitmap will be used to record the data regions that
>
> "If you say Y here" is confusing because it could be "Y" or "M".
> Maybe:
Yeah, I just copy this from other configs.
>
> "MD write intent bitmap support. The bitmap can be used to
> optimize resync speed after power failure, limiting it to recorded dirty
> sectors in bitmap. This feature can be added to existing MD array or MD array
> can be created with bitmap via mdadm(8).
> If unsure, say M."
>
> "M" because MD is in real life is often compiled as module- shouldn't it be
> always same?
> We should not allow "Y" if MD is "M", perhaps we need to do more in Kconfig to
> prevent this?
Kconfig already do this, if the depends config is 'M', this config can't
be set to 'Y'.
>
> It is always good to refer how it can be configured and who uses it in
> userspace. You can eventually add that it is MD internal module if you don't
> see it enough clear.
>
> On other think- what about recovery? Isn't it used to improve recovery speed? We
> have checkpointing for recovery. If I remember correctly it is always 7
> checkpoints for the array (at least for IMSM). Isn't bitmap used to improve
> this? If yes, please add it here.
For recovery means add a new disk to the array? If so, bitmap is useless
in this case, if you means readd a hot removed disk, then yes, however,
I think this is actually 'resync'.
Anyway, I'll add both power failure and readd a disk in the next
version.
>
> The part below should go to the Documentation because these are implementation
> details that are not needed to take a decision about enabling/disabling the
> module.
> Please consider adding Documentation entry.
I probably will delay te Documentation untill the new bitmap, I'll just
remove the below.
Thanks,
Kuai
>
> For the code- lgtm.
>
> Great job Kuai! I love your contribution.
> Thanks,
> Mariusz
>
>> need
>> + to be resynced after a power failure, preventing a full disk
>> resync.
>> + The bitmap size ranges from 4K to 132K, depending on the array
>> size.
>> + Each bit corresponds to 2 bytes of data and is managed in
>> + self-maintained memory. All bits are protected by a disk-level
>> + spinlock.
>> +
>> + If unsure, say Y.
>> +
>> config MD_AUTODETECT
>> bool "Autodetect RAID arrays during kernel boot"
>> depends on BLK_DEV_MD=y
>> diff --git a/drivers/md/Makefile b/drivers/md/Makefile
>> index 476a214e4bdc..387670f766b7 100644
>> --- a/drivers/md/Makefile
>> +++ b/drivers/md/Makefile
>> @@ -27,14 +27,14 @@ dm-clone-y += dm-clone-target.o dm-clone-metadata.o
>> dm-verity-y += dm-verity-target.o
>> dm-zoned-y += dm-zoned-target.o dm-zoned-metadata.o dm-zoned-reclaim.o
>>
>> -md-mod-y += md.o md-bitmap.o
>> +md-mod-y += md.o
>> raid456-y += raid5.o raid5-cache.o raid5-ppl.o
>>
>> # Note: link order is important. All raid personalities
>> # and must come before md.o, as they each initialise
>> # themselves, and md.o may use the personalities when it
>> # auto-initialised.
>> -
>> +obj-$(CONFIG_MD_BITMAP) += md-bitmap.o
>> obj-$(CONFIG_MD_RAID0) += raid0.o
>> obj-$(CONFIG_MD_RAID1) += raid1.o
>> obj-$(CONFIG_MD_RAID10) += raid10.o
>> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
>> index f68eb79e739d..148a479d32c0 100644
>> --- a/drivers/md/md-bitmap.c
>> +++ b/drivers/md/md-bitmap.c
>> @@ -212,6 +212,8 @@ struct bitmap {
>> int cluster_slot;
>> };
>>
>> +static struct workqueue_struct *md_bitmap_wq;
>> +
>> static int __bitmap_resize(struct bitmap *bitmap, sector_t blocks,
>> int chunksize, bool init);
>>
>> @@ -2970,6 +2972,9 @@ static struct attribute_group md_bitmap_group = {
>> };
>>
>> static struct bitmap_operations bitmap_ops = {
>> + .version = 1,
>> + .owner = THIS_MODULE,
>> +
>> .enabled = bitmap_enabled,
>> .create = bitmap_create,
>> .resize = bitmap_resize,
>> @@ -3001,7 +3006,26 @@ static struct bitmap_operations bitmap_ops = {
>> .group = &md_bitmap_group,
>> };
>>
>> -void mddev_set_bitmap_ops(struct mddev *mddev)
>> +static int __init bitmap_init(void)
>> +{
>> + md_bitmap_wq = alloc_workqueue("md_bitmap", WQ_MEM_RECLAIM |
>> WQ_UNBOUND,
>> + 0);
>> + if (!md_bitmap_wq)
>> + return -ENOMEM;
>> +
>> + INIT_LIST_HEAD(&bitmap_ops.list);
>> + register_md_bitmap(&bitmap_ops);
>> + return 0;
>> +}
>> +
>> +static void __exit bitmap_exit(void)
>> {
>> - mddev->bitmap_ops = &bitmap_ops;
>> + destroy_workqueue(md_bitmap_wq);
>> + unregister_md_bitmap(&bitmap_ops);
>> }
>> +
>> +module_init(bitmap_init);
>> +module_exit(bitmap_exit);
>> +
>> +MODULE_LICENSE("GPL");
>> +MODULE_DESCRIPTION("Bitmap for MD");
>> diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
>> index 0c19983453c7..9d1bf3c43125 100644
>> --- a/drivers/md/md-bitmap.h
>> +++ b/drivers/md/md-bitmap.h
>> @@ -71,6 +71,10 @@ struct md_bitmap_stats {
>> };
>>
>> struct bitmap_operations {
>> + int version;
>> + struct module *owner;
>> + struct list_head list;
>> +
>> bool (*enabled)(struct mddev *mddev);
>> int (*create)(struct mddev *mddev, int slot);
>> int (*resize)(struct mddev *mddev, sector_t blocks, int chunksize);
>> @@ -110,7 +114,7 @@ struct bitmap_operations {
>> struct attribute_group *group;
>> };
>>
>> -/* the bitmap API */
>> -void mddev_set_bitmap_ops(struct mddev *mddev);
>> +void register_md_bitmap(struct bitmap_operations *op);
>> +void unregister_md_bitmap(struct bitmap_operations *op);
>>
>> #endif
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index d16a3d0f2b90..09fac65b83b8 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -83,6 +83,9 @@ static const char *action_name[NR_SYNC_ACTIONS] = {
>> static LIST_HEAD(pers_list);
>> static DEFINE_SPINLOCK(pers_lock);
>>
>> +static LIST_HEAD(bitmap_list);
>> +static DEFINE_SPINLOCK(bitmap_lock);
>> +
>> static const struct kobj_type md_ktype;
>>
>> const struct md_cluster_operations *md_cluster_ops;
>> @@ -100,7 +103,6 @@ static struct workqueue_struct *md_wq;
>> * workqueue whith reconfig_mutex grabbed.
>> */
>> static struct workqueue_struct *md_misc_wq;
>> -struct workqueue_struct *md_bitmap_wq;
>>
>> static int remove_and_add_spares(struct mddev *mddev,
>> struct md_rdev *this);
>> @@ -630,15 +632,96 @@ static void active_io_release(struct percpu_ref *ref)
>>
>> static void no_op(struct percpu_ref *r) {}
>>
>> +void register_md_bitmap(struct bitmap_operations *op)
>> +{
>> + pr_info("md: bitmap version %d registered\n", op->version);
>> +
>> + spin_lock(&bitmap_lock);
>> + list_add_tail(&op->list, &bitmap_list);
>> + spin_unlock(&bitmap_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(register_md_bitmap);
>> +
>> +void unregister_md_bitmap(struct bitmap_operations *op)
>> +{
>> + pr_info("md: bitmap version %d unregistered\n", op->version);
>> +
>> + spin_lock(&bitmap_lock);
>> + list_del_init(&op->list);
>> + spin_unlock(&bitmap_lock);
>> +}
>> +EXPORT_SYMBOL_GPL(unregister_md_bitmap);
>> +
>> +static struct bitmap_operations *__find_bitmap(int version)
>> +{
>> + struct bitmap_operations *op;
>> +
>> + list_for_each_entry(op, &bitmap_list, list)
>> + if (op->version == version) {
>> + if (try_module_get(op->owner))
>> + return op;
>> + else
>> + return NULL;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static struct bitmap_operations *find_bitmap(int version)
>> +{
>> + struct bitmap_operations *op = NULL;
>> +
>> + spin_lock(&bitmap_lock);
>> + op = __find_bitmap(version);
>> + spin_unlock(&bitmap_lock);
>> +
>> + if (op)
>> + return op;
>> +
>> + if (request_module("md-bitmap") != 0)
>> + return NULL;
>> +
>> + spin_lock(&bitmap_lock);
>> + op = __find_bitmap(version);
>> + spin_unlock(&bitmap_lock);
>> +
>> + return op;
>> +}
>> +
>> +/* TODO: support more versions */
>> +static int mddev_set_bitmap_ops(struct mddev *mddev)
>> +{
>> + struct bitmap_operations *op = find_bitmap(1);
>> +
>> + if (!op)
>> + return -ENODEV;
>> +
>> + mddev->bitmap_ops = op;
>> + return 0;
>> +}
>> +
>> +static void mddev_clear_bitmap_ops(struct mddev *mddev)
>> +{
>> + module_put(mddev->bitmap_ops->owner);
>> + mddev->bitmap_ops = NULL;
>> +}
>> +
>> int mddev_init(struct mddev *mddev)
>> {
>> + int ret = mddev_set_bitmap_ops(mddev);
>> +
>> + if (ret)
>> + return ret;
>>
>> if (percpu_ref_init(&mddev->active_io, active_io_release,
>> - PERCPU_REF_ALLOW_REINIT, GFP_KERNEL))
>> + PERCPU_REF_ALLOW_REINIT, GFP_KERNEL)) {
>> + mddev_clear_bitmap_ops(mddev);
>> return -ENOMEM;
>> + }
>>
>> if (percpu_ref_init(&mddev->writes_pending, no_op,
>> PERCPU_REF_ALLOW_REINIT, GFP_KERNEL)) {
>> + mddev_clear_bitmap_ops(mddev);
>> percpu_ref_exit(&mddev->active_io);
>> return -ENOMEM;
>> }
>> @@ -666,7 +749,6 @@ int mddev_init(struct mddev *mddev)
>> mddev->resync_min = 0;
>> mddev->resync_max = MaxSector;
>> mddev->level = LEVEL_NONE;
>> - mddev_set_bitmap_ops(mddev);
>>
>> INIT_WORK(&mddev->sync_work, md_start_sync);
>> INIT_WORK(&mddev->del_work, mddev_delayed_delete);
>> @@ -677,6 +759,7 @@ EXPORT_SYMBOL_GPL(mddev_init);
>>
>> void mddev_destroy(struct mddev *mddev)
>> {
>> + mddev_clear_bitmap_ops(mddev);
>> percpu_ref_exit(&mddev->active_io);
>> percpu_ref_exit(&mddev->writes_pending);
>> }
>> @@ -9898,11 +9981,6 @@ static int __init md_init(void)
>> if (!md_misc_wq)
>> goto err_misc_wq;
>>
>> - md_bitmap_wq = alloc_workqueue("md_bitmap", WQ_MEM_RECLAIM |
>> WQ_UNBOUND,
>> - 0);
>> - if (!md_bitmap_wq)
>> - goto err_bitmap_wq;
>> -
>> ret = __register_blkdev(MD_MAJOR, "md", md_probe);
>> if (ret < 0)
>> goto err_md;
>> @@ -9921,8 +9999,6 @@ static int __init md_init(void)
>> err_mdp:
>> unregister_blkdev(MD_MAJOR, "md");
>> err_md:
>> - destroy_workqueue(md_bitmap_wq);
>> -err_bitmap_wq:
>> destroy_workqueue(md_misc_wq);
>> err_misc_wq:
>> destroy_workqueue(md_wq);
>> @@ -10229,7 +10305,6 @@ static __exit void md_exit(void)
>> spin_unlock(&all_mddevs_lock);
>>
>> destroy_workqueue(md_misc_wq);
>> - destroy_workqueue(md_bitmap_wq);
>> destroy_workqueue(md_wq);
>> }
>>
>> diff --git a/drivers/md/md.h b/drivers/md/md.h
>> index 5eaac1d84523..28347fb3af18 100644
>> --- a/drivers/md/md.h
>> +++ b/drivers/md/md.h
>> @@ -972,7 +972,6 @@ struct mdu_array_info_s;
>> struct mdu_disk_info_s;
>>
>> extern int mdp_major;
>> -extern struct workqueue_struct *md_bitmap_wq;
>> void md_autostart_arrays(int part);
>> int md_set_array_info(struct mddev *mddev, struct mdu_array_info_s *info);
>> int md_add_new_disk(struct mddev *mddev, struct mdu_disk_info_s *info);
>
>
> .
>
Powered by blists - more mailing lists