Message-ID: <ead8c4fcded5a1341a9ce0ca6aa471a52f6d0051.camel@yandex.ru>
Date: Thu, 01 Aug 2024 17:46:18 +0300
From: Konstantin Kharlamov <Hi-Angel@...dex.ru>
To: Bryan Gurney <bgurney@...hat.com>
Cc: Yu Kuai <yukuai1@...weicloud.com>, Song Liu <song@...nel.org>,
linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org,
"yangerkun@...wei.com" <yangerkun@...wei.com>, "yukuai (C)"
<yukuai3@...wei.com>, dm-devel@...ts.linux.dev, Matthew Sakai
<msakai@...hat.com>
Subject: Re: Lockup of (raid5 or raid6) + vdo after taking out a disk under
load
On Wed, 2024-07-31 at 16:41 -0400, Bryan Gurney wrote:
> Hi Konstantin,
>
> This sounds a lot like something that I encountered with md, back in
> 2019, on the old vdo-devel mailing list:
>
> https://listman.redhat.com/archives/vdo-devel/2019-August/000171.html
>
> Basically, I had a RAID-5 md array that was in the process of
> recovery:
>
> $ cat /proc/mdstat
> Personalities : [raid0] [raid6] [raid5] [raid4]
> md0 : active raid5 sde[4] sdd[2] sdc[1] sdb[0]
>       2929890816 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
>       [=>...................] recovery = 9.1% (89227836/976630272) finish=85.1min speed=173727K/sec
> bitmap: 0/8 pages [0KB], 65536KB chunk
>
> Note that the speed of the recovery is 173,727 KB/sec, which is less
> than the sync_speed_max value:
>
> $ grep . /sys/block/md0/md/sync_speed*
> /sys/block/md0/md/sync_speed:171052
> /sys/block/md0/md/sync_speed_max:200000 (system)
> /sys/block/md0/md/sync_speed_min:1000 (system)
>
> ...And when I decreased "sync_speed_max" to "65536", I stopped seeing
> hung task timeouts.
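>
> For reference, the per-array cap can be adjusted on the fly by writing the
> sysfs attribute directly; a minimal sketch, reusing the same 65536 value:
>
> # echo 65536 > /sys/block/md0/md/sync_speed_max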
>
> There's a similar setting in dm-raid: the "--maxrecoveryrate" option
> of lvchange. So, to set the maximum recovery rate to 64 MiB per
> second per device, this would be the command, for an example VG/LV of
> "p_r5/testdmraid5"
>
> # lvchange --maxrecoveryrate 64M p_r5/testdmraid5
>
> (Older hard disk drives may not have a sequential read / write speed
> of more than 100 MiB/sec; this meant that md's default of 200 MiB/sec
> was "too fast", and would result in the recovery I/O starving the VDO
> volume from being able to service I/O.)
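>
> That 200 MiB/sec figure is md's stock default, also exposed system-wide as
> the dev.raid.speed_limit_max sysctl; something along these lines should show
> it and, if desired, lower it globally (65536 here is just an example value):
>
> # sysctl dev.raid.speed_limit_max
> # sysctl -w dev.raid.speed_limit_max=65536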
>
> The current value of max_recovery_rate for dm-raid can be displayed
> with "lvs -a -o +raid_max_recovery_rate".
>
> By reducing the maximum recovery rate for the dm-raid RAID-5 logical
> volume, does this result in the hung task timeouts for the
> "dm-vdo0-bioQ*" to not appear, and for the fio job to continue
> writing?
Thank you. So, I tried this out, and it doesn't seem to work that well (unless
perhaps something has changed in the userspace LVM since the 2.03.11 I am
using?).

After executing the original steps to reproduce, I have these two volumes:
$ lvs
  LV                    VG   Attr         LSize Pool                  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  deco_vol              p_r5 vwi-XXv-X- 100.00g vdo_internal_deco_vol
  vdo_internal_deco_vol p_r5 dwi-XX--X-  20.00g

Executing the suggested lvchange command is refused on both of them:

$ lvchange --maxrecoveryrate 64M p_r5/deco_vol
Command on LV p_r5/deco_vol uses options that require LV types raid .
Command not permitted on LV p_r5/deco_vol.
$ lvchange --maxrecoveryrate 64M p_r5/vdo_internal_deco_vol
Command on LV p_r5/vdo_internal_deco_vol uses options that require LV types raid .
Command not permitted on LV p_r5/vdo_internal_deco_vol.

Also, executing `lvs -a -o +raid_max_recovery_rate` shows an empty value for
that field. However, the command does reveal the various internal volumes:

$ lvs -a -o +raid_max_recovery_rate
  LV                                      VG   Attr         LSize Pool                  Origin Data%  Meta%  Move Log Cpy%Sync Convert MaxSync
  deco_vol                                p_r5 vwi-XXv-X- 100.00g vdo_internal_deco_vol
  vdo_internal_deco_vol                   p_r5 dwi-XX--X-  20.00g
  [vdo_internal_deco_vol_vdata]           p_r5 rwi-aor---  20.00g                                                        100.00
  [vdo_internal_deco_vol_vdata_rimage_0]  p_r5 iwi-aor---  10.00g
  [vdo_internal_deco_vol_vdata_rimage_1]  p_r5 iwi-aor---  10.00g
  [vdo_internal_deco_vol_vdata_rimage_2]  p_r5 iwi-aor---  10.00g
  [vdo_internal_deco_vol_vdata_rmeta_0]   p_r5 ewi-aor---   4.00m
  [vdo_internal_deco_vol_vdata_rmeta_1]   p_r5 ewi-aor---   4.00m
  [vdo_internal_deco_vol_vdata_rmeta_2]   p_r5 ewi-aor---   4.00m

So I tried executing the command on them:

$ lvchange --maxrecoveryrate 64M p_r5/{vdo_internal_deco_vol_vdata,vdo_internal_deco_vol_vdata_rimage_0,vdo_internal_deco_vol_vdata_rimage_1,vdo_internal_deco_vol_vdata_rimage_2,vdo_internal_deco_vol_vdata_rmeta_0,vdo_internal_deco_vol_vdata_rmeta_1,vdo_internal_deco_vol_vdata_rmeta_2}
Command on LV p_r5/vdo_internal_deco_vol_vdata_rimage_0 uses options that require LV types raid .
Command not permitted on LV p_r5/vdo_internal_deco_vol_vdata_rimage_0.
Command on LV p_r5/vdo_internal_deco_vol_vdata_rmeta_0 uses options that require LV types raid .
Command not permitted on LV p_r5/vdo_internal_deco_vol_vdata_rmeta_0.
Command on LV p_r5/vdo_internal_deco_vol_vdata_rimage_1 uses options that require LV types raid .
Command not permitted on LV p_r5/vdo_internal_deco_vol_vdata_rimage_1.
Command on LV p_r5/vdo_internal_deco_vol_vdata_rmeta_1 uses options that require LV types raid .
Command not permitted on LV p_r5/vdo_internal_deco_vol_vdata_rmeta_1.
Command on LV p_r5/vdo_internal_deco_vol_vdata_rimage_2 uses options that require LV types raid .
Command not permitted on LV p_r5/vdo_internal_deco_vol_vdata_rimage_2.
Command on LV p_r5/vdo_internal_deco_vol_vdata_rmeta_2 uses options that require LV types raid .
Command not permitted on LV p_r5/vdo_internal_deco_vol_vdata_rmeta_2.
Logical volume p_r5/vdo_internal_deco_vol_vdata changed.

This resulted in exactly one volume having its recovery rate changed:
`[vdo_internal_deco_vol_vdata]`.
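
For what it's worth, I assume the resulting cap could also be double-checked at
the device-mapper level, since the dm-raid table line carries the optional
recovery-rate parameters; something like the following, modulo the exact vg-lv
name mangling:

$ dmsetup table p_r5-vdo_internal_deco_vol_vdata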

With that done, I tried removing a disk under load, and it results in the same
old lockup reports:

-----------------------

So, to sum up: the lvchange command only managed to change the recovery rate of
a single internal volume, and that didn't make the lockup go away.