Message-ID: <5d71468a-e5c3-4a85-b985-466bae6af70e@oss.qualcomm.com>
Date: Wed, 3 Dec 2025 11:38:52 +0800
From: Cong Zhang <cong.zhang@....qualcomm.com>
To: Ming Lei <ming.lei@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>, Daniel Wagner <dwagner@...e.de>,
        Hannes Reinecke <hare@...e.de>, linux-arm-msm@...r.kernel.org,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        pavan.kondeti@....qualcomm.com
Subject: Re: [PATCH] blk-mq: Abort suspend when wakeup events are pending



On 12/2/2025 8:29 PM, Ming Lei wrote:
> On Tue, Dec 02, 2025 at 05:48:21PM +0800, Cong Zhang wrote:
>>
>>
>> On 12/2/2025 5:20 PM, Ming Lei wrote:
>>> On Tue, Dec 02, 2025 at 11:56:12AM +0800, Cong Zhang wrote:
>>>> During system suspend, wakeup-capable IRQs for a block device can be
>>>> delayed, which can cause blk_mq_hctx_notify_offline() to hang
>>>> indefinitely while waiting for pending requests to complete.
>>>> Skip the request waiting loop and abort suspend when wakeup events are
>>>> pending to prevent the deadlock.
>>>>
>>>> Fixes: bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline")
>>>> Signed-off-by: Cong Zhang <cong.zhang@....qualcomm.com>
>>>> ---
>>>> The issue was found during system suspend with a no_soft_reset
>>>> virtio-blk device. Here is the detailed analysis:
>>>> - When system suspend starts and no_soft_reset is enabled, virtio-blk
>>>>   does not call its suspend callback.
>>>> - Some requests are dispatched and queued. After sending the virtqueue
>>>>   notifier, the kernel waits for an IRQ to complete the request.
>>>> - The virtio-blk IRQ is wakeup-capable. When the IRQ is triggered, it
>>>>   remains pending because the device is in the suspend process.
>>>
>>> Can you explain the above point a bit? Why does the IRQ remain pending
>>> and not get handled?
>>>
>>
>> The wakeup-capable IRQ is not masked during suspend. When the IRQ is
>> triggered, the kernel does not call its IRQ handler; instead, it only
>> marks the IRQ as a wakeup event in pm_system_irq_wakeup(). By checking
>> pm_wakeup_pending(), the suspend process can abort if a wakeup event is
>> detected. That means the actual IRQ handler is not called while
>> blk_mq_hctx_has_requests() is being polled, which causes the issue.
> 
> Thanks for the explanation!
> 
> Can you document it around `if (pm_wakeup_pending)`?
> 
> Otherwise, this patch looks fine to me.
> 

Thanks for your comments! I will update this in the new patchset.
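
For readers following the thread, here is a minimal user-space sketch of the
behaviour discussed above. It is not the patch itself and it is not kernel
code. Everything named model_* is hypothetical: struct model_hctx stands in
for the hardware queue context, its wakeup_pending flag stands in for
pm_wakeup_pending() returning true, and irq_handled models whether the
completion IRQ handler actually runs. The -EBUSY return value and the
max_ticks bound are assumptions made only so that the model terminates; the
real drain loop has no such bound, which is why it can hang.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hardware queue; not struct blk_mq_hw_ctx. */
struct model_hctx {
	int inflight;        /* requests still waiting for a completion IRQ */
	bool wakeup_pending; /* models pm_wakeup_pending() returning true */
	bool irq_handled;    /* models whether the IRQ handler really runs */
};

/* Models the role of blk_mq_hctx_has_requests(). */
static bool model_has_requests(const struct model_hctx *hctx)
{
	return hctx->inflight > 0;
}

/* One polling interval: a request completes only if its handler runs. */
static void model_wait_tick(struct model_hctx *hctx)
{
	if (hctx->irq_handled && hctx->inflight > 0)
		hctx->inflight--;
}

/*
 * Models the drain loop in the CPU-offline notifier with the extra check.
 * Returns 0 once drained, -EBUSY (assumed error code) when a wakeup event
 * is pending, and -ETIMEDOUT when the model-only bound is reached.
 */
int model_notify_offline(struct model_hctx *hctx, int max_ticks)
{
	assert(hctx != NULL);

	for (int i = 0; model_has_requests(hctx); i++) {
		/* The IRQ was recorded as a wakeup event, not handled: abort. */
		if (hctx->wakeup_pending)
			return -EBUSY;
		/* Model only: without this bound the loop would spin forever. */
		if (i >= max_ticks)
			return -ETIMEDOUT;
		model_wait_tick(hctx);
	}
	return 0;
}
```

With irq_handled set to false and wakeup_pending set to false, the model can
only leave the loop through the artificial bound, which corresponds to the
hang described in the commit message. Setting wakeup_pending to true makes
the loop exit at once so that suspend can be aborted.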

> 
> Thanks,
> Ming
> 

