Message-ID: <CAJZ5v0g36Ea-XNBmsMSJxkAKz8zZNzWr_HA7AJOtS2NZOqAfEA@mail.gmail.com>
Date: Wed, 26 Nov 2025 18:21:51 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Bart Van Assche <bvanassche@....org>
Cc: Yang Yang <yang.yang@...o.com>, Jens Axboe <axboe@...nel.dk>, Pavel Machek <pavel@...nel.org>,
Len Brown <lenb@...nel.org>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Danilo Krummrich <dakr@...nel.org>, linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-pm@...r.kernel.org
Subject: Re: [PATCH 0/2] PM: runtime: Fix potential I/O hang

On Wed, Nov 26, 2025 at 5:59 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Wed, Nov 26, 2025 at 4:48 PM Bart Van Assche <bvanassche@....org> wrote:
> >
> > On 11/26/25 3:31 AM, Rafael J. Wysocki wrote:
> > > Please address the issue differently.
> >
> > It seems unfortunate to me that __pm_runtime_barrier() can cause pm_request_resume() to hang.
>
> I wouldn't call it a hang.
>
> __pm_runtime_barrier() removes the work item queued by
> pm_request_resume(), but at the time when it is called, which is
> device_suspend_late(), the work item queued by pm_request_resume()
> cannot make progress anyway. It will only be able to make progress
> when the PM workqueue is unfrozen at the end of the system resume
> transition.
>
> > Would it be safe to remove the
> > cancel_work_sync() call from __pm_runtime_barrier() since
> > pm_runtime_work() calls functions that check disable_depth
> > when processing RPM_REQ_SUSPEND and RPM_REQ_AUTOSUSPEND? Would
> > this be sufficient to fix the reported deadlock?
>
> If you want the resume work item to survive the system suspend/resume
> cycle, __pm_runtime_disable() may be changed to make that happen, but
> this still will not allow the work to make progress until the system
> resume ends.
>
> I'm not sure if this would help to address the issue at hand though.

I actually have a better idea: Why don't we resume all devices that
have runtime resume work items pending at the time when
device_suspend() is called?

Arguably, somebody wanted them to runtime-resume, so they should be
resumed before being prepared for system suspend, and that will
eliminate the issue at hand (because devices cannot runtime-suspend
during system suspend/resume).
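
To make the idea above concrete, here is a rough user-space sketch of it. It is a toy model only, not kernel code and not a proposed patch: every toy_* name is hypothetical, and the struct fields merely imitate the kind of per-device runtime PM state the kernel keeps. It assumes a pending resume request can be detected per device and resolved by resuming the device synchronously.

```c
#include <stdbool.h>

/* Toy model: all names below are made up for illustration. */
enum toy_rpm_status { TOY_RPM_ACTIVE, TOY_RPM_SUSPENDED };
enum toy_rpm_request { TOY_RPM_REQ_NONE, TOY_RPM_REQ_RESUME };

struct toy_device {
	enum toy_rpm_status status;	/* current runtime PM state */
	enum toy_rpm_request request;	/* queued asynchronous request */
	bool request_pending;		/* true while the request is queued */
};

/* Models an asynchronous resume request: it only queues work. */
static void toy_request_resume(struct toy_device *dev)
{
	if (dev->status == TOY_RPM_SUSPENDED) {
		dev->request = TOY_RPM_REQ_RESUME;
		dev->request_pending = true;
	}
}

/* Models a synchronous resume: drop the queued request, mark active. */
static void toy_resume_sync(struct toy_device *dev)
{
	dev->request = TOY_RPM_REQ_NONE;
	dev->request_pending = false;
	dev->status = TOY_RPM_ACTIVE;
}

/*
 * Models the check suggested above, done when system suspend reaches
 * the device: if a resume request is still queued, carry it out now
 * instead of leaving it for a workqueue that cannot run.
 * Returns true if the device was resumed here.
 */
static bool toy_device_suspend(struct toy_device *dev)
{
	if (dev->request_pending && dev->request == TOY_RPM_REQ_RESUME) {
		toy_resume_sync(dev);
		return true;
	}
	return false;
}
```

In this model, a device that had a resume request queued before toy_device_suspend() runs ends up active with nothing pending, so there is no work item left for a later barrier to cancel.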