Message-ID: <37a0cd0f-2613-e2de-286f-b762312f6c3e@molgen.mpg.de>
Date:   Wed, 15 Mar 2023 08:52:02 +0100
From:   Donald Buczek <buczek@...gen.mpg.de>
To:     Marc Smith <msmith626@...il.com>,
        Guoqing Jiang <guoqing.jiang@...ud.ionos.com>
Cc:     Song Liu <song@...nel.org>, linux-raid@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        it+raid@...gen.mpg.de
Subject: Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle"
 transition

Hi,

I can only comment that the simple patch I proposed at https://lore.kernel.org/linux-raid/bc342de0-98d2-1733-39cd-cc1999777ff3@molgen.mpg.de/ has been working for us across several different kernel versions. We currently run 195 raid6 JBODs on 105 systems, each going through several "idle->sync->idle" transitions per month, for over two years now.

So if you are affected by this problem and are able to apply patches to the kernel you use, you might give it a try.
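
For reference, the core of that patch is the same locking change quoted
further down in this thread: drop reconfig_mutex while the sync thread is
being stopped, then re-take it afterwards. A rough sketch of the idea
(context approximated from the md_check_recovery()/md_reap_sync_thread()
path; exact placement differs between kernel versions, so treat this as
illustrative, not the literal diff):

          /* resync has finished, collect result */
          mddev_unlock(mddev);                       /* drop reconfig_mutex first */
          md_unregister_thread(&mddev->sync_thread); /* kthread_stop() sleeps until
                                                      * md_do_sync() exits; holding
                                                      * the mutex across that wait is
                                                      * the deadlock discussed below */
          mddev_lock(mddev);                         /* then re-take the mutex */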

Best
   Donald

On 3/14/23 14:25, Marc Smith wrote:
> On Mon, Feb 8, 2021 at 7:49 PM Guoqing Jiang
> <guoqing.jiang@...ud.ionos.com> wrote:
>>
>> Hi Donald,
>>
>> On 2/8/21 19:41, Donald Buczek wrote:
>>> Dear Guoqing,
>>>
>>> On 08.02.21 15:53, Guoqing Jiang wrote:
>>>>
>>>>
>>>> On 2/8/21 12:38, Donald Buczek wrote:
>>>>>> 5. maybe don't hold reconfig_mutex when try to unregister
>>>>>> sync_thread, like this.
>>>>>>
>>>>>>           /* resync has finished, collect result */
>>>>>>           mddev_unlock(mddev);
>>>>>>           md_unregister_thread(&mddev->sync_thread);
>>>>>>           mddev_lock(mddev);
>>>>>
>>>>> As above: While we wait for the sync thread to terminate, wouldn't it
>>>>> be a problem, if another user space operation takes the mutex?
>>>>
>>>> I don't think other code paths can block while holding the mutex;
>>>> otherwise those paths could cause a potential deadlock. Please try the
>>>> above two-line change. And perhaps others have a better idea.
>>>
>>> Yes, this works. No deadlock after >11000 seconds.
>>>
>>> (Time till deadlock from previous runs/seconds: 1723, 37, 434, 1265,
>>> 3500, 1136, 109, 1892, 1060, 664, 84, 315, 12, 820 )
>>
>> Great. I will send a formal patch with your reported-by and tested-by.
>>
>> Thanks,
>> Guoqing
> 
> I'm still hitting this issue with Linux 5.4.229 -- it looks like only
> one of the two patches that supposedly resolve this was applied to the
> stable kernels; the other was omitted due to a regression:
> md: don't unregister sync_thread with reconfig_mutex held (upstream
> commit 8b48ec23cc51a4e7c8dbaef5f34ebe67e1a80934)
> 
> I don't see any follow-up on the June 8th, 2022 thread that asked for
> this patch to be dropped from all stable kernels because it caused a
> regression.
> 
> The patch doesn't appear to be present in the current mainline kernel
> (6.3-rc2) either. So I assume this issue is still present there, or it
> was resolved differently and I just can't find the commit/patch.
> 
> I can induce the issue by using Donald's script above, which will
> eventually result in hangs (a minimal equivalent is sketched after the
> traces below):
> ...
> [147948.504621] INFO: task md_test_2.sh:68033 blocked for more than 122 seconds.
> [147948.504624]       Tainted: P           OE     5.4.229-esos.prod #1
> [147948.504624] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [147948.504625] md_test_2.sh    D    0 68033      1 0x00000004
> [147948.504627] Call Trace:
> [147948.504634]  __schedule+0x4ab/0x4f3
> [147948.504637]  ? usleep_range+0x7a/0x7a
> [147948.504638]  schedule+0x67/0x81
> [147948.504639]  schedule_timeout+0x2c/0xe5
> [147948.504643]  ? do_raw_spin_lock+0x2b/0x52
> [147948.504644]  __wait_for_common+0xc4/0x13a
> [147948.504647]  ? wake_up_q+0x40/0x40
> [147948.504649]  kthread_stop+0x9a/0x117
> [147948.504653]  md_unregister_thread+0x43/0x4d
> [147948.504655]  md_reap_sync_thread+0x1c/0x1d5
> [147948.504657]  action_store+0xc9/0x284
> [147948.504658]  md_attr_store+0x9f/0xb8
> [147948.504661]  kernfs_fop_write+0x10a/0x14c
> [147948.504664]  vfs_write+0xa0/0xdd
> [147948.504666]  ksys_write+0x71/0xba
> [147948.504668]  do_syscall_64+0x52/0x60
> [147948.504671]  entry_SYSCALL_64_after_hwframe+0x5c/0xc1
> ...
> [147948.504748] INFO: task md120_resync:135315 blocked for more than 122 seconds.
> [147948.504749]       Tainted: P           OE     5.4.229-esos.prod #1
> [147948.504749] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [147948.504749] md120_resync    D    0 135315      2 0x80004000
> [147948.504750] Call Trace:
> [147948.504752]  __schedule+0x4ab/0x4f3
> [147948.504754]  ? printk+0x53/0x6a
> [147948.504755]  schedule+0x67/0x81
> [147948.504756]  md_do_sync+0xae7/0xdd9
> [147948.504758]  ? remove_wait_queue+0x41/0x41
> [147948.504759]  md_thread+0x128/0x151
> [147948.504761]  ? _raw_spin_lock_irqsave+0x31/0x5d
> [147948.504762]  ? md_start_sync+0xdc/0xdc
> [147948.504763]  kthread+0xe4/0xe9
> [147948.504764]  ? kthread_flush_worker+0x70/0x70
> [147948.504765]  ret_from_fork+0x35/0x40
> ...
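> 
> Donald's script isn't re-quoted here; in spirit it bounces sync_action
> between "check" and "idle" until a transition hangs. A hypothetical
> minimal equivalent in C (untested; the md device path below is a
> placeholder you would have to adjust for your system):
> 
>     /* repro sketch: repeatedly start and abort a "check" on one array */
>     #include <fcntl.h>
>     #include <stdio.h>
>     #include <string.h>
>     #include <unistd.h>
> 
>     static void set_action(const char *path, const char *val)
>     {
>         int fd = open(path, O_WRONLY);
>         if (fd < 0) { perror(path); return; }
>         if (write(fd, val, strlen(val)) < 0)
>             perror("write");
>         close(fd);
>     }
> 
>     int main(void)
>     {
>         const char *p = "/sys/block/md0/md/sync_action"; /* placeholder device */
>         for (;;) {
>             set_action(p, "check"); /* start a check */
>             sleep(10);              /* let it run briefly */
>             set_action(p, "idle");  /* abort it -> the problematic transition */
>             sleep(1);
>         }
>         return 0;
>     }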
> 
> This happens on 'raid6' MD RAID arrays that initially have sync_action==resync.
> 
> Any guidance would be greatly appreciated.
> 
> --Marc

-- 
Donald Buczek
buczek@...gen.mpg.de
Tel: +49 30 8413 1433
