Message-ID: <CAPhsuW4k_C=UxwESU4t7R+fpoAJ_HE8g_PpCJXSUGWOdbpCEoQ@mail.gmail.com>
Date: Thu, 8 Feb 2024 16:49:13 -0800
From: Song Liu <song@...nel.org>
To: Li Nan <linan666@...weicloud.com>
Cc: axboe@...nel.dk, linux-raid@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, yukuai3@...wei.com, yi.zhang@...wei.com,
houtao1@...wei.com, yangerkun@...wei.com
Subject: Re: [PATCH] block: fix deadlock between bd_link_disk_holder and
partition scan
On Thu, Feb 8, 2024 at 12:44 AM Li Nan <linan666@...weicloud.com> wrote:
>
>
>
> On 2024/2/8 14:50, Song Liu wrote:
> > On Wed, Feb 7, 2024 at 1:32 AM <linan666@...weicloud.com> wrote:
> >>
> >> From: Li Nan <linan122@...wei.com>
> >>
> >> 'open_mutex' of gendisk is used to protect open/close block devices. But
> >> in bd_link_disk_holder(), it is used to protect the creation of symlink
> >> between holding disk and slave bdev, which introduces some issues.
> >>
> >> When bd_link_disk_holder() is called, the driver is usually in the process
> >> of initialization/modification and may have suspended io submission. At
> >> this time, any io holding 'open_mutex', such as a partition scan, can
> >> cause a deadlock. For example, in raid:
> >>
> >> T1                                T2
> >> bdev_open_by_dev
> >>  lock open_mutex [1]
> >>  ...
> >>  efi_partition
> >>   ...
> >>   md_submit_bio
> >>                                   md_ioctl mddev_suspend
> >>                                    -> suspend all io
> >>                                   md_add_new_disk
> >>                                    bind_rdev_to_array
> >>                                     bd_link_disk_holder
> >>                                      try lock open_mutex [2]
> >>   md_handle_request
> >>    -> wait mddev_resume
> >>
> >> T1 scans partitions, T2 adds a new device to the raid. T1 waits for T2 to
> >> resume the mddev, while T2 waits for the open_mutex held by T1. Deadlock
> >> occurs.
> >>
> >> Fix it by introducing a local mutex 'holder_mutex' to replace 'open_mutex'.
> >
> > Is this to fix [1]? Do we need some Fixes and/or Closes tags?
> >
>
> No. This patch is just another way to fix [2]; either [2] or this patch can
> fix the issue. I am not sure about the root cause of [1] yet.
>
> [2] https://patchwork.kernel.org/project/linux-raid/list/?series=812045
>
> > Could you please add steps to reproduce this issue?
>
> We need to modify the kernel by adding sleeps in md_submit_bio() and
> md_ioctl() as below, and then:
> 1. mdadm -CR /dev/md0 -l1 -n2 /dev/sd[bc] #create a raid
> 2. echo 1 > /sys/module/md_mod/parameters/error_inject #enable sleep
> 3. 'mdadm --add /dev/md0 /dev/sda' #add a disk to raid
> 4. submit a BLKRRPART ioctl to the raid within 10s
The analysis makes sense. I also hit the issue a couple of times without adding
extra delays. But I am not sure whether this is the best fix (I didn't find real
issues with it either).
Maybe we don't need to suspend the array for ADD_NEW_DISK, so that
something like the following might just work?
Thanks,
Song
@@ -7573,7 +7577,6 @@ static inline bool md_ioctl_valid(unsigned int cmd)
static bool md_ioctl_need_suspend(unsigned int cmd)
{
switch (cmd) {
- case ADD_NEW_DISK:
case HOT_ADD_DISK:
case HOT_REMOVE_DISK:
case SET_BITMAP_FILE: