Message-ID: <ZcE4mGXCDwjqBXgf@bmarzins-01.fast.eng.rdu2.dc.redhat.com>
Date: Mon, 5 Feb 2024 14:35:52 -0500
From: Benjamin Marzinski <bmarzins@...hat.com>
To: Yu Kuai <yukuai1@...weicloud.com>
Cc: mpatocka@...hat.com, heinzm@...hat.com, xni@...hat.com,
blazej.kucman@...ux.intel.com, agk@...hat.com, snitzer@...nel.org,
dm-devel@...ts.linux.dev, song@...nel.org, jbrassow@....redhat.com,
neilb@...e.de, shli@...com, akpm@...l.org,
linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
yi.zhang@...wei.com, yangerkun@...wei.com,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH v5 00/14] dm-raid/md/raid: fix v6.7 regressions
On Sun, Feb 04, 2024 at 09:35:09AM +0800, Yu Kuai wrote:
> Hi,
>
> On 2024/02/03 11:19, Benjamin Marzinski wrote:
> > On Thu, Feb 01, 2024 at 05:25:45PM +0800, Yu Kuai wrote:
> > > From: Yu Kuai <yukuai3@...wei.com>
> > > I applied this patchset on top of v6.8-rc1 and ran the lvm2 test suite
> > > with the following cmd for 24 rounds (for about 2 days):
> > >
> > > for t in `ls test/shell`; do
> > >         if cat test/shell/$t | grep raid &> /dev/null; then
> > >                 make check T=shell/$t
> > >         fi
> > > done
> > >
> > > failed count failed test
> > > 1 ### failed: [ndev-vanilla] shell/dmsecuretest.sh
> > > 1 ### failed: [ndev-vanilla] shell/dmsetup-integrity-keys.sh
> > > 1 ### failed: [ndev-vanilla] shell/dmsetup-keyring.sh
> > > 5 ### failed: [ndev-vanilla] shell/duplicate-pvs-md0.sh
> > > 1 ### failed: [ndev-vanilla] shell/duplicate-vgid.sh
> > > 2 ### failed: [ndev-vanilla] shell/duplicate-vgnames.sh
> > > 1 ### failed: [ndev-vanilla] shell/fsadm-crypt.sh
> > > 1 ### failed: [ndev-vanilla] shell/integrity.sh
> > > 6 ### failed: [ndev-vanilla] shell/lvchange-raid1-writemostly.sh
> > > 2 ### failed: [ndev-vanilla] shell/lvchange-rebuild-raid.sh
> > > 5 ### failed: [ndev-vanilla] shell/lvconvert-raid-reshape-stripes-load-reload.sh
> > > 4 ### failed: [ndev-vanilla] shell/lvconvert-raid-restripe-linear.sh
> > > 1 ### failed: [ndev-vanilla] shell/lvconvert-raid1-split-trackchanges.sh
> > > 20 ### failed: [ndev-vanilla] shell/lvconvert-repair-raid.sh
> > > 20 ### failed: [ndev-vanilla] shell/lvcreate-large-raid.sh
> > > 24 ### failed: [ndev-vanilla] shell/lvextend-raid.sh
> > >
> > > And I randomly picked some of these tests and verified by hand that
> > > they fail in v6.6 as well (not all tests):
> > >
> > > shell/lvextend-raid.sh
> > > shell/lvcreate-large-raid.sh
> > > shell/lvconvert-repair-raid.sh
> > > shell/lvchange-rebuild-raid.sh
> > > shell/lvchange-raid1-writemostly.sh
> >
> > In my testing with this patchset on top of the head of Linus's tree
> > (5c24e4e9e708), I am seeing failures in
> > shell/lvconvert-raid-reshape-stripes-load-reload.sh and
> > shell/lvconvert-repair-raid.sh in about 20% of my runs. I have never
> > seen either of these fail running on the 6.6 kernel (ffc253263a13).
>
> This sounds quite different from my testing. As I said, the test
> shell/lvconvert-repair-raid.sh is very likely to fail in v6.6 already;
> I don't know why it never fails in your testing. Test log in v6.6:
>
> | [ 1:38.162] #lvconvert-repair-raid.sh:1+ aux teardown
> | [ 1:38.162] ## teardown.......## removing stray mapped devices with names beginning with LVMTEST3474:
> | [ 1:39.207] .set +vx; STACKTRACE; set -vx
> | [ 1:41.448] ##lvconvert-repair-raid.sh:1+ set +vx
> | [ 1:41.448] ## - /mnt/test/lvm2/test/shell/lvconvert-repair-raid.sh:1
> | [ 1:41.449] ## 1 STACKTRACE() called from /mnt/test/lvm2/test/shell/lvconvert-repair-raid.sh:1
> | [ 1:41.449] ## ERROR: The test started dmeventd (3718) unexpectedly.
>
> And the same happens in v6.8-rc1. Do you perhaps know how to fix this
> error?
Could you run the test with something like
# make check_local T=lvconvert-repair-raid.sh VERBOSE=1 > out 2>&1
and post the output?
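
It might also be worth checking whether a dmeventd instance is already
running before the test starts, since the error complains about an
unexpected dmeventd. Something like this (just a suggestion, not part of
the lvm2 harness) would show that:

# pgrep -a dmeventd
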
-Ben
> Thanks,
> Kuai
>
> >
> > lvconvert-repair-raid.sh creates a raid array, disables one of its
> > drives before there's enough time to finish the initial sync, and then
> > tries to repair it. This is supposed to fail (the test uses dm-delay
> > devices to slow down the sync). When the test succeeds, I see things
> > like this:
> >
> > [ 0:13.469] #lvconvert-repair-raid.sh:161+ lvcreate --type raid10 -m 1 -i 2 -L 64 -n LV1 LVMTEST191946vg /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv1 /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv2 /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv3 /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv4
> > [ 0:13.469] Using default stripesize 64.00 KiB.
> > [ 0:13.483] Logical volume "LV1" created.
> > [ 0:14.042] 6,8908,1194343108,-;device-mapper: raid: Superblocks created for new raid set
> > [ 0:14.042] 5,8909,1194348704,-;md/raid10:mdX: not clean -- starting background reconstruction
> > [ 0:14.042] 6,8910,1194349443,-;md/raid10:mdX: active with 4 out of 4 devices
> > [ 0:14.042] 4,8911,1194459161,-;mdX: bitmap file is out of date, doing full recovery
> > [ 0:14.042] 6,8912,1194563810,-;md: resync of RAID array mdX
> > [ 0:14.042] WARNING: This metadata update is NOT backed up.
> > [ 0:14.042] aux disable_dev "$dev4"
> > [ 0:14.058] #lvconvert-repair-raid.sh:163+ aux disable_dev /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv4
> > [ 0:14.058] Disabling device /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv4 (253:5)
> > [ 0:14.101] not lvconvert -y --repair $vg/$lv1
> >
> > When it fails, I see:
> >
> > [ 0:13.831] #lvconvert-repair-raid.sh:161+ lvcreate --type raid10 -m 1 -i 2 -L 64 -n LV1 LVMTEST192248vg /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv1 /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv2 /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv3 /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv4
> > [ 0:13.831] Using default stripesize 64.00 KiB.
> > [ 0:13.847] Logical volume "LV1" created.
> > [ 0:14.499] WARNING: This metadata update is NOT backed up.
> > [ 0:14.499] 6,8925,1187444256,-;device-mapper: raid: Superblocks created for new raid set
> > [ 0:14.499] 5,8926,1187449525,-;md/raid10:mdX: not clean -- starting background reconstruction
> > [ 0:14.499] 6,8927,1187450148,-;md/raid10:mdX: active with 4 out of 4 devices
> > [ 0:14.499] 6,8928,1187452472,-;md: resync of RAID array mdX
> > [ 0:14.499] 6,8929,1187453016,-;md: mdX: resync done.
> > [ 0:14.499] 4,8930,1187555486,-;mdX: bitmap file is out of date, doing full recovery
> > [ 0:14.499] aux disable_dev "$dev4"
> > [ 0:14.515] #lvconvert-repair-raid.sh:163+ aux disable_dev /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv4
> > [ 0:14.515] Disabling device /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv4 (253:5)
> > [ 0:14.554] not lvconvert -y --repair $vg/$lv1
> >
> > To me, the important-looking difference (and I admit, I'm no RAID
> > expert) is that in the case where the test passes (where lvconvert
> > fails as expected), I see:
> >
> > [ 0:14.042] 4,8911,1194459161,-;mdX: bitmap file is out of date, doing full recovery
> > [ 0:14.042] 6,8912,1194563810,-;md: resync of RAID array mdX
> >
> > When it fails I see:
> >
> > [ 0:14.499] 6,8928,1187452472,-;md: resync of RAID array mdX
> > [ 0:14.499] 6,8929,1187453016,-;md: mdX: resync done.
> > [ 0:14.499] 4,8930,1187555486,-;mdX: bitmap file is out of date, doing full recovery
> >
> > This appears to show a resync that takes no time, presumably because
> > it happens before the device notices that the bitmaps are wrong and
> > schedules a full recovery.
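> >
> > For reference, here is a minimal sketch of the test's key steps,
> > reconstructed from the logs above (the aux helper and the
> > $vg/$lv1/$dev* names come from the lvm2 test harness; the dm-delay
> > setup is omitted):
> >
> > lvcreate --type raid10 -m 1 -i 2 -L 64 -n $lv1 $vg \
> >         "$dev1" "$dev2" "$dev3" "$dev4"  # initial sync runs slowly over dm-delay
> > aux disable_dev "$dev4"                  # drop a leg before the sync can finish
> > not lvconvert -y --repair $vg/$lv1       # the repair is expected to fail mid-sync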
> >
> >
> > lvconvert-raid-reshape-stripes-load-reload.sh repeatedly reloads the
> > device table during a raid reshape, and then tests the filesystem for
> > corruption afterwards. With this patchset, the filesystem is
> > occasionally corrupted. I do not see this with the 6.6 kernel.
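> >
> > Roughly what that test exercises (my sketch under a plain reading of
> > the test's name, not the actual script; the dm device name is an
> > assumption):
> >
> > tab=$(dmsetup table "$vg-$lv1")          # capture the live table
> > for i in $(seq 1 50); do                 # reload it repeatedly mid-reshape
> >         dmsetup suspend "$vg-$lv1"
> >         echo "$tab" | dmsetup load "$vg-$lv1"
> >         dmsetup resume "$vg-$lv1"
> > done
> > fsck -n "/dev/$vg/$lv1"                  # the filesystem must still be clean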
> >
> > -Ben
> > > Xiao Ni also tested the last version on a real machine, see [1].
> > >
> > > [1] https://lore.kernel.org/all/CALTww29QO5kzmN6Vd+jT=-8W5F52tJjHKSgrfUc1Z1ZAeRKHHA@mail.gmail.com/
> > >
> > > Yu Kuai (14):
> > > md: don't ignore suspended array in md_check_recovery()
> > > md: don't ignore read-only array in md_check_recovery()
> > > md: make sure md_do_sync() will set MD_RECOVERY_DONE
> > > md: don't register sync_thread for reshape directly
> > > md: don't suspend the array for interrupted reshape
> > > md: fix missing release of 'active_io' for flush
> > > md: export helpers to stop sync_thread
> > > md: export helper md_is_rdwr()
> > > dm-raid: really frozen sync_thread during suspend
> > > md/dm-raid: don't call md_reap_sync_thread() directly
> > > dm-raid: add a new helper prepare_suspend() in md_personality
> > > md/raid456: fix a deadlock for dm-raid456 while io concurrent with
> > > reshape
> > > dm-raid: fix lockdep warning in "pers->hot_add_disk"
> > > dm-raid: remove mddev_suspend/resume()
> > >
> > > drivers/md/dm-raid.c | 78 +++++++++++++++++++--------
> > > drivers/md/md.c | 126 +++++++++++++++++++++++++++++--------------
> > > drivers/md/md.h | 16 ++++++
> > > drivers/md/raid10.c | 16 +-----
> > > drivers/md/raid5.c | 61 +++++++++++----------
> > > 5 files changed, 192 insertions(+), 105 deletions(-)
> > >
> > > --
> > > 2.39.2
> > >
> >
> >