Message-ID: <a131af22-0a5b-4be1-b77e-8716c63e8883@oracle.com>
Date:   Thu, 9 Nov 2023 15:36:32 -0800
From:   junxiao.bi@...cle.com
To:     Tejun Heo <tj@...nel.org>
Cc:     linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
        jiangshanlai@...il.com, song@...nel.org
Subject: Re: [RFC] workqueue: allow system workqueue be used in memory reclaim


On 11/9/23 10:58 AM, Tejun Heo wrote:
> Hello,
>
> On Tue, Nov 07, 2023 at 05:28:21PM -0800, Junxiao Bi wrote:
>> The following deadlock was triggered on Intel IMSM raid1 volumes.
>>
>> The sequence of events is this:
>>
>> 1. Memory reclaim was waiting for an xfs journal flush and got stuck behind
>> the md flush work.
>>
>> 2. The md flush work was queued on the "md" workqueue but never got executed:
>> a new kworker thread could not be created, and the rescuer thread was busy
>> executing md flush work for another md disk, where it got stuck because the
>> "MD_SB_CHANGE_PENDING" flag was set.
>>
>> 3. That flag should have been set by some md write process that was asking
>> to update the md superblock to change the in_sync status to 0. It then used
>> kernfs_notify() to ask the "mdmon" process to update the superblock, and
>> after that the write process waited for the flag to be cleared.
>>
>> 4. But "mdmon" was never woken up, because kernfs_notify() depends on the
>> system-wide workqueue "system_wq" to do the notification, and since that
>> workqueue doesn't have a rescuer thread, the notification never happens.
> Things like this can't be fixed by adding RECLAIM to system_wq because
> system_wq is shared and someone else might occupy that rescuer thread. The
> flag doesn't guarantee unlimited forward progress. It only guarantees
> forward progress of one work item.
>
> That seems to be where the problem is in #2 in the first place. If a work
> item is required during memory reclaim, it must have guaranteed forward
> progress but it looks like that's waiting for someone else who can end up
> waiting for userspace?
>
> You'll need to untangle the dependencies earlier.
Makes sense. Thanks a lot for the comments.
>
> Thanks.
>
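
For illustration only, the point about the rescuer can be sketched as a toy userspace model. This is not kernel code and does not use any kernel API: `Work`, `SimWorkqueue`, `simulate`, `scenario`, and the event names are all hypothetical, and the model assumes that under memory pressure no new kworker can be created, so only a single rescuer per workqueue (if it has one) runs items, one at a time and in order.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Work:
    """A work item in the toy model (hypothetical, not a kernel structure)."""
    name: str
    waits_for: Optional[str] = None  # event that must occur before the item can finish
    provides: Optional[str] = None   # event signalled when the item finishes


class SimWorkqueue:
    """Toy workqueue under memory pressure: only the rescuer can run items."""

    def __init__(self, name, has_rescuer):
        self.name = name
        self.has_rescuer = has_rescuer
        self.pending = deque()

    def queue_work(self, work):
        self.pending.append(work)

    def step(self, events):
        """Let the rescuer run items in order until it blocks or the queue is empty."""
        finished = []
        if not self.has_rescuer:
            return finished  # no rescuer and no new kworkers: nothing runs
        while self.pending:
            work = self.pending[0]
            if work.waits_for is not None and work.waits_for not in events:
                break  # the one rescuer is stuck inside this item
            self.pending.popleft()
            if work.provides is not None:
                events.add(work.provides)
            finished.append(work.name)
        return finished


def simulate(queues):
    """Run all queues repeatedly until no queue makes progress."""
    events, completed = set(), []
    progress = True
    while progress:
        progress = False
        for wq in queues:
            finished = wq.step(events)
            if finished:
                completed.extend(finished)
                progress = True
    return completed


def scenario(sys_has_rescuer, sys_rescuer_occupied):
    """Model the reported sequence with optional changes to system_wq."""
    md = SimWorkqueue("md", has_rescuer=True)
    # Flush for the other disk waits for MD_SB_CHANGE_PENDING to be cleared.
    md.queue_work(Work("md_flush_disk_B", waits_for="sb_change_pending_cleared"))
    md.queue_work(Work("md_flush_disk_A"))

    system_wq = SimWorkqueue("system_wq", has_rescuer=sys_has_rescuer)
    if sys_rescuer_occupied:
        # Some unrelated user of the shared workqueue blocks indefinitely.
        system_wq.queue_work(Work("unrelated_blocked_work", waits_for="never_happens"))
    # Simplification: the notify work stands in for "mdmon wakes up and clears the flag".
    system_wq.queue_work(Work("kernfs_notify", provides="sb_change_pending_cleared"))
    return simulate([md, system_wq])


if __name__ == "__main__":
    print("no rescuer on system_wq:   ", scenario(False, False))
    print("rescuer, otherwise idle:   ", scenario(True, False))
    print("rescuer taken by other work:", scenario(True, True))
```

In this model, giving system_wq a rescuer only helps while nothing else occupies it. As soon as an unrelated blocked item sits ahead of the notify work, the md flush is stuck again, which is why the dependency itself has to be removed rather than the flag added.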
