Message-ID: <CALTww289ZzZP5TmD5qezaYZV0Mnb90abqMqR=OnAzRz3NkmhQQ@mail.gmail.com>
Date: Thu, 6 Nov 2025 20:35:38 +0800
From: Xiao Ni <xni@...hat.com>
To: yukuai@...as.com
Cc: Li Nan <linan666@...weicloud.com>, corbet@....net, song@...nel.org, hare@...e.de,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-raid@...r.kernel.org, yangerkun@...wei.com, yi.zhang@...wei.com
Subject: Re: [PATCH v9 4/5] md: add check_new_feature module parameter
On Thu, Nov 6, 2025 at 11:45 AM Yu Kuai <yukuai@...as.com> wrote:
>
> Hi,
>
> On 2025/11/4 15:17, Xiao Ni wrote:
> > On Tue, Nov 4, 2025 at 10:52 AM Li Nan <linan666@...weicloud.com> wrote:
> >>
> >>
> >> On 2025/11/4 9:47, Xiao Ni wrote:
> >>> On Mon, Nov 3, 2025 at 9:06 PM <linan666@...weicloud.com> wrote:
> >>>> From: Li Nan <linan122@...wei.com>
> >>>>
> >>>> Raid checks if pad3 is zero when loading superblock from disk. Arrays
> >>>> created with new features may fail to assemble on old kernels as pad3
> >>>> is used.
> >>>>
> >>>> Add module parameter check_new_feature to bypass this check.
> >>>>
> >>>> Signed-off-by: Li Nan <linan122@...wei.com>
> >>>> ---
> >>>> drivers/md/md.c | 12 +++++++++---
> >>>> 1 file changed, 9 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >>>> index dffc6a482181..5921fb245bfa 100644
> >>>> --- a/drivers/md/md.c
> >>>> +++ b/drivers/md/md.c
> >>>> @@ -339,6 +339,7 @@ static int start_readonly;
> >>>> */
> >>>> static bool create_on_open = true;
> >>>> static bool legacy_async_del_gendisk = true;
> >>>> +static bool check_new_feature = true;
> >>>>
> >>>> /*
> >>>> * We have a system wide 'event count' that is incremented
> >>>> @@ -1850,9 +1851,13 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
> >>>> }
> >>>> if (sb->pad0 ||
> >>>> sb->pad3[0] ||
> >>>> - memcmp(sb->pad3, sb->pad3+1, sizeof(sb->pad3) - sizeof(sb->pad3[1])))
> >>>> - /* Some padding is non-zero, might be a new feature */
> >>>> - return -EINVAL;
> >>>> + memcmp(sb->pad3, sb->pad3+1, sizeof(sb->pad3) - sizeof(sb->pad3[1]))) {
> >>>> + pr_warn("Some padding is non-zero on %pg, might be a new feature\n",
> >>>> + rdev->bdev);
> >>>> + if (check_new_feature)
> >>>> + return -EINVAL;
> >>>> + pr_warn("check_new_feature is disabled, data corruption possible\n");
> >>>> + }
> >>>>
> >>>> rdev->preferred_minor = 0xffff;
> >>>> rdev->data_offset = le64_to_cpu(sb->data_offset);
> >>>> @@ -10704,6 +10709,7 @@ module_param(start_dirty_degraded, int, S_IRUGO|S_IWUSR);
> >>>> module_param_call(new_array, add_named_array, NULL, NULL, S_IWUSR);
> >>>> module_param(create_on_open, bool, S_IRUSR|S_IWUSR);
> >>>> module_param(legacy_async_del_gendisk, bool, 0600);
> >>>> +module_param(check_new_feature, bool, 0600);
> >>>>
> >>>> MODULE_LICENSE("GPL");
> >>>> MODULE_DESCRIPTION("MD RAID framework");
> >>>> --
> >>>> 2.39.2
> >>>>
> >>> Hi
> >>>
> >>> Thanks for finding this problem in time. The default of this module
> >>> parameter is true. I don't think people will check for new module
> >>> parameters after updating to a new kernel. They will find that the
> >>> array can't be assembled and report bugs. You already use pad3, so
> >>> would it be good to remove the pad3 check directly here?
> >>>
> >>> By the way, have you run the regression tests?
> >>>
> >>> Regards
> >>> Xiao
> >>>
> >>>
> >>> .
> >> Hi Xiao.
> >>
> >> Thanks for your review.
> >>
> >> Deleting this check directly is risky. For example, with configurable LBS:
> >> if the user sets the LBS to 4K, the LBS of a RAID array assembled on an old
> >> kernel becomes 512. Forcing use of this array then risks data loss, which
> >> is the original issue this feature wants to solve.
> > You're right, we can't delete the check.
> > For the old kernel, the array which has specified logical size can't
> > be assembled. This patch still can't fix this problem, because it is
> > an old kernel and this patch is for a new kernel, right?
> > For existing arrays, they don't have such problems. They can be
> > assembled after updating to a new kernel.
> > So, do we need this patch?
>
> We have a use case where a user creates the array with an old kernel, and
> then, if something bad happens in the system (possibly unrelated to the array),
> the user may update to a mainline release and later switch back to our release.
> We want a solution that lets the user still use the array in this case.
Hi all,

Let me check whether I understand correctly:
1. A machine running an old kernel has problems.
2. The user updates to a new kernel that has the new feature.
3. The user creates an array with the new kernel.
4. The user switches back to the old kernel, and assembly fails because
   sb->pad3 is in use and non-zero.
The old kernel is right to refuse. This is expected behavior, right?
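
To make step 4 concrete, here is a minimal userspace sketch of the decision the quoted patch makes around the padding check. It is not the kernel code: the struct `sb_pad_sketch`, its field sizes, and the helpers `padding_in_use()` and `check_padding()` are hypothetical names used only for illustration. An old kernel has no such parameter, so it always behaves as if `check_new_feature` were true.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the superblock padding fields. */
struct sb_pad_sketch {
	uint32_t pad0;
	uint8_t pad3[32];	/* size chosen for illustration only */
};

/* Return true if any padding byte is non-zero. */
static bool padding_in_use(const struct sb_pad_sketch *sb)
{
	size_t i;

	if (sb->pad0)
		return true;
	for (i = 0; i < sizeof(sb->pad3); i++)
		if (sb->pad3[i])
			return true;
	return false;
}

/*
 * Return 0 to continue loading the superblock, or -EINVAL to refuse.
 * With check_new_feature == true, non-zero padding is refused; with
 * false, loading continues even though the padding is in use.
 */
static int check_padding(const struct sb_pad_sketch *sb, bool check_new_feature)
{
	if (!padding_in_use(sb))
		return 0;
	if (check_new_feature)
		return -EINVAL;
	return 0;
}
```

In this sketch, an array whose padding is non-zero is refused unless the caller explicitly disables the check, which corresponds to loading md with check_new_feature set to false in the patch.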
>
> >
> >> Future features may also have similar risks, so instead of deleting this
> >> check directly, I chose to add a module parameter to give users a choice.
> >> What do you think?
> > Maybe we can add a feature bit to avoid the kernel parameter. This
> > feature bit can be set when specifying logical block size.
>
> The situation still stands: for an unknown feature bit, we had better forbid
> assembling the array by default to prevent data loss.
If I understand correctly, the old kernel already refuses to assemble it.
Regards
Xiao
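
For reference, the feature-bit idea discussed above could look roughly like the sketch below. This is only an assumption about how such a check might be shaped: the bit names, their values, and `check_feature_map()` are hypothetical and are not taken from the kernel headers. A kernel that does not know a bit refuses the array, and a kernel that knows it accepts the array.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical feature bits; names and values are illustrative only. */
#define SKETCH_FEATURE_BITMAP	(1u << 0)
#define SKETCH_FEATURE_LBS	(1u << 1)	/* "logical block size was set" */

/* Feature sets understood by a hypothetical old and new kernel. */
#define SKETCH_OLD_KNOWN	(SKETCH_FEATURE_BITMAP)
#define SKETCH_NEW_KNOWN	(SKETCH_FEATURE_BITMAP | SKETCH_FEATURE_LBS)

/* Refuse the array if the superblock sets any bit outside 'known'. */
static int check_feature_map(uint32_t feature_map, uint32_t known)
{
	return (feature_map & ~known) ? -EINVAL : 0;
}
```

Here `check_feature_map(SKETCH_FEATURE_LBS, SKETCH_OLD_KNOWN)` refuses the array, while the same superblock passes with `SKETCH_NEW_KNOWN`.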
>
> Thanks,
> Kuai
>
> >
> > Regards
> > Xiao
> >> --
> >> Thanks,
> >> Nan
> >>
>