linux-kernel - Re: [PATCH v5 00/14] dm-raid/md/raid: fix v6.7 regressions

Message-ID: <528ce926-6f17-c1ea-8e77-c7d5d7f56022@huaweicloud.com>
Date: Sun, 4 Feb 2024 09:35:09 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Benjamin Marzinski <bmarzins@...hat.com>,
 Yu Kuai <yukuai1@...weicloud.com>
Cc: mpatocka@...hat.com, heinzm@...hat.com, xni@...hat.com,
 blazej.kucman@...ux.intel.com, agk@...hat.com, snitzer@...nel.org,
 dm-devel@...ts.linux.dev, song@...nel.org, jbrassow@....redhat.com,
 neilb@...e.de, shli@...com, akpm@...l.org, linux-kernel@...r.kernel.org,
 linux-raid@...r.kernel.org, yi.zhang@...wei.com, yangerkun@...wei.com,
 "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH v5 00/14] dm-raid/md/raid: fix v6.7 regressions

Hi,

On 2024/02/03 11:19, Benjamin Marzinski wrote:
> On Thu, Feb 01, 2024 at 05:25:45PM +0800, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>> I applied this patchset on top of v6.8-rc1 and ran the lvm2 test suite with
>> the following command for 24 rounds (about 2 days):
>>
>> for t in `ls test/shell`; do
>>          if cat test/shell/$t | grep raid &> /dev/null; then
>>                  make check T=shell/$t
>>          fi
>> done
>>
>> failed count                             failed test
>>        1 ###       failed: [ndev-vanilla] shell/dmsecuretest.sh
>>        1 ###       failed: [ndev-vanilla] shell/dmsetup-integrity-keys.sh
>>        1 ###       failed: [ndev-vanilla] shell/dmsetup-keyring.sh
>>        5 ###       failed: [ndev-vanilla] shell/duplicate-pvs-md0.sh
>>        1 ###       failed: [ndev-vanilla] shell/duplicate-vgid.sh
>>        2 ###       failed: [ndev-vanilla] shell/duplicate-vgnames.sh
>>        1 ###       failed: [ndev-vanilla] shell/fsadm-crypt.sh
>>        1 ###       failed: [ndev-vanilla] shell/integrity.sh
>>        6 ###       failed: [ndev-vanilla] shell/lvchange-raid1-writemostly.sh
>>        2 ###       failed: [ndev-vanilla] shell/lvchange-rebuild-raid.sh
>>        5 ###       failed: [ndev-vanilla] shell/lvconvert-raid-reshape-stripes-load-reload.sh
>>        4 ###       failed: [ndev-vanilla] shell/lvconvert-raid-restripe-linear.sh
>>        1 ###       failed: [ndev-vanilla] shell/lvconvert-raid1-split-trackchanges.sh
>>       20 ###       failed: [ndev-vanilla] shell/lvconvert-repair-raid.sh
>>       20 ###       failed: [ndev-vanilla] shell/lvcreate-large-raid.sh
>>       24 ###       failed: [ndev-vanilla] shell/lvextend-raid.sh
>>
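
The selection loop and the tally above could be reproduced roughly as follows. This is only a sketch: select_raid_tests and tally_failures are hypothetical helper names, and it assumes the output of each round was saved to a file containing "### failed: ..." lines in the form shown in the table.

```shell
#!/bin/sh
# Sketch only: these helpers are illustrative and not part of the lvm2 test suite.

# Print the names of the test scripts in directory $1 that mention "raid".
select_raid_tests() {
    for t in "$1"/*.sh; do
        [ -f "$t" ] || continue
        if grep -q raid "$t"; then
            basename "$t"
        fi
    done
}

# Count how often each test is reported as failed across the given log files.
# Assumes lines such as: "###       failed: [ndev-vanilla] shell/lvextend-raid.sh"
tally_failures() {
    sed -n 's/^.*###[[:space:]]*\(failed: .*\)$/\1/p' "$@" | sort | uniq -c
}
```

Each name printed by select_raid_tests test/shell can then be passed to make check T=shell/<name>, and tally_failures can be run over the saved per-round logs once all rounds have finished.
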
>> I also randomly picked some of these tests (not all of them) and verified
>> by hand that they fail in v6.6 as well:
>>
>> shell/lvextend-raid.sh
>> shell/lvcreate-large-raid.sh
>> shell/lvconvert-repair-raid.sh
>> shell/lvchange-rebuild-raid.sh
>> shell/lvchange-raid1-writemostly.sh
> 
> In my testing with this patchset on top of the head of linus's tree
> (5c24e4e9e708) I am seeing failures in
> shell/lvconvert-raid-reshape-stripes-load-reload.sh and
> shell/lvconvert-repair-raid.sh in about 20% of my runs. I have never
> seen either of these fail running on the 6.6 kernel (ffc253263a13).

This is quite different from my testing. As I said, the test
shell/lvconvert-repair-raid.sh is very likely to fail in v6.6 already.
I don't know why it never fails in your testing. Test log from v6.6:

| [ 1:38.162] #lvconvert-repair-raid.sh:1+ aux teardown
| [ 1:38.162] ## teardown.......## removing stray mapped devices with names beginning with LVMTEST3474:
| [ 1:39.207] .set +vx; STACKTRACE; set -vx
| [ 1:41.448] ##lvconvert-repair-raid.sh:1+ set +vx
| [ 1:41.448] ## - /mnt/test/lvm2/test/shell/lvconvert-repair-raid.sh:1
| [ 1:41.449] ## 1 STACKTRACE() called from /mnt/test/lvm2/test/shell/lvconvert-repair-raid.sh:1
| [ 1:41.449] ## ERROR: The test started dmeventd (3718) unexpectedly.

The same happens in v6.8-rc1. Do you happen to know how to fix this error?

Thanks,
Kuai

> 
> lvconvert-repair-raid.sh creates a raid array, disables one of its
> drives before the initial sync has had time to finish, and then tries
> to repair it. The repair is supposed to fail (the test uses dm-delay
> devices to slow down the sync). When the test succeeds, I see things like this:
> 
> [ 0:13.469] #lvconvert-repair-raid.sh:161+ lvcreate --type raid10 -m 1 -i 2 -L 64 -n LV1 LVMTEST191946vg /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv1 /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv2 /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv3 /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv4
> [ 0:13.469]   Using default stripesize 64.00 KiB.
> [ 0:13.483]   Logical volume "LV1" created.
> [ 0:14.042] 6,8908,1194343108,-;device-mapper: raid: Superblocks created for new raid set
> [ 0:14.042] 5,8909,1194348704,-;md/raid10:mdX: not clean -- starting background reconstruction
> [ 0:14.042] 6,8910,1194349443,-;md/raid10:mdX: active with 4 out of 4 devices
> [ 0:14.042] 4,8911,1194459161,-;mdX: bitmap file is out of date, doing full recovery
> [ 0:14.042] 6,8912,1194563810,-;md: resync of RAID array mdX
> [ 0:14.042]   WARNING: This metadata update is NOT backed up.
> [ 0:14.042] aux disable_dev "$dev4"
> [ 0:14.058] #lvconvert-repair-raid.sh:163+ aux disable_dev /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv4
> [ 0:14.058] Disabling device /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv4 (253:5)
> [ 0:14.101] not lvconvert -y --repair $vg/$lv1
> 
> When it fails, I see:
> 
> [ 0:13.831] #lvconvert-repair-raid.sh:161+ lvcreate --type raid10 -m 1 -i 2 -L 64 -n LV1 LVMTEST192248vg /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv1 /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv2 /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv3 /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv4
> [ 0:13.831]   Using default stripesize 64.00 KiB.
> [ 0:13.847]   Logical volume "LV1" created.
> [ 0:14.499]   WARNING: This metadata update is NOT backed up.
> [ 0:14.499] 6,8925,1187444256,-;device-mapper: raid: Superblocks created for new raid set
> [ 0:14.499] 5,8926,1187449525,-;md/raid10:mdX: not clean -- starting background reconstruction
> [ 0:14.499] 6,8927,1187450148,-;md/raid10:mdX: active with 4 out of 4 devices
> [ 0:14.499] 6,8928,1187452472,-;md: resync of RAID array mdX
> [ 0:14.499] 6,8929,1187453016,-;md: mdX: resync done.
> [ 0:14.499] 4,8930,1187555486,-;mdX: bitmap file is out of date, doing full recovery
> [ 0:14.499] aux disable_dev "$dev4"
> [ 0:14.515] #lvconvert-repair-raid.sh:163+ aux disable_dev /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv4
> [ 0:14.515] Disabling device /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv4 (253:5)
> [ 0:14.554] not lvconvert -y --repair $vg/$lv1
> 
> To me, the important-looking difference (and I admit, I'm no RAID expert) is that in the
> case where the test passes (where lvconvert fails as expected), I see
> 
> [ 0:14.042] 4,8911,1194459161,-;mdX: bitmap file is out of date, doing full recovery
> [ 0:14.042] 6,8912,1194563810,-;md: resync of RAID array mdX
> 
> When it fails I see:
> 
> [ 0:14.499] 6,8928,1187452472,-;md: resync of RAID array mdX
> [ 0:14.499] 6,8929,1187453016,-;md: mdX: resync done.
> [ 0:14.499] 4,8930,1187555486,-;mdX: bitmap file is out of date, doing full recovery
> 
> Which appears to show a resync that takes no time, presumably because it happens before
> the device notices that the bitmaps are wrong and schedules a full recovery.
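
The ordering described above can be checked mechanically across many runs. A minimal sketch, assuming the kernel messages of a single test run were saved to a file; classify_resync_order is a hypothetical helper, not something the lvm2 test suite provides, and the message patterns are taken from the two excerpts quoted above:

```shell
#!/bin/sh
# Sketch only: classify_resync_order is an illustrative helper, not part of lvm2.
# Prints "early-resync-done" when "resync done." is logged before the
# "bitmap file is out of date" message, and "ok" otherwise.
classify_resync_order() {
    awk '
        # A "resync done" line seen before any bitmap message marks the suspicious order.
        /md: mdX: resync done\./ && !bitmap_seen { early = 1 }
        /bitmap file is out of date/             { bitmap_seen = 1 }
        END { print (early ? "early-resync-done" : "ok") }
    ' "$1"
}
```

Running this over the log of every round would show whether the failing runs always have the early "resync done" ordering, or whether that ordering also appears in runs that pass.
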
> 
> 
> lvconvert-raid-reshape-stripes-load-reload.sh repeatedly reloads the
> device table during a raid reshape, and then tests the filesystem for
> corruption afterwards. With this patchset, the filesystem is
> occasionally corrupted.  I do not see this with the 6.6 kernel.
> 
> -Ben
>   
>> Xiao Ni also tested the last version on a real machine, see [1].
>>
>> [1] https://lore.kernel.org/all/CALTww29QO5kzmN6Vd+jT=-8W5F52tJjHKSgrfUc1Z1ZAeRKHHA@mail.gmail.com/
>>
>> Yu Kuai (14):
>>    md: don't ignore suspended array in md_check_recovery()
>>    md: don't ignore read-only array in md_check_recovery()
>>    md: make sure md_do_sync() will set MD_RECOVERY_DONE
>>    md: don't register sync_thread for reshape directly
>>    md: don't suspend the array for interrupted reshape
>>    md: fix missing release of 'active_io' for flush
>>    md: export helpers to stop sync_thread
>>    md: export helper md_is_rdwr()
>>    dm-raid: really frozen sync_thread during suspend
>>    md/dm-raid: don't call md_reap_sync_thread() directly
>>    dm-raid: add a new helper prepare_suspend() in md_personality
>>    md/raid456: fix a deadlock for dm-raid456 while io concurrent with
>>      reshape
>>    dm-raid: fix lockdep waring in "pers->hot_add_disk"
>>    dm-raid: remove mddev_suspend/resume()
>>
>>   drivers/md/dm-raid.c |  78 +++++++++++++++++++--------
>>   drivers/md/md.c      | 126 +++++++++++++++++++++++++++++--------------
>>   drivers/md/md.h      |  16 ++++++
>>   drivers/md/raid10.c  |  16 +-----
>>   drivers/md/raid5.c   |  61 +++++++++++----------
>>   5 files changed, 192 insertions(+), 105 deletions(-)
>>
>> -- 
>> 2.39.2
>>