Message-ID: <CAJZ5v0haFgyoXqapvpESUec0_Pxw-uckTGSpVOWDQPbxWU-=Dg@mail.gmail.com>
Date: Tue, 2 Dec 2025 14:37:19 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Bart Van Assche <bvanassche@....org>
Cc: YangYang <yang.yang@...o.com>, Jens Axboe <axboe@...nel.dk>, Pavel Machek <pavel@...nel.org>,
Len Brown <lenb@...nel.org>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Danilo Krummrich <dakr@...nel.org>, linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-pm@...r.kernel.org
Subject: Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable

On Tue, Dec 2, 2025 at 1:14 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Tue, Dec 2, 2025 at 1:41 AM Bart Van Assche <bvanassche@....org> wrote:
> >
> > On 12/1/25 10:47 AM, Rafael J. Wysocki wrote:
> > > Generally speaking, if blk_queue_enter() or __bio_queue_enter() may
> > > run in parallel with device_suspend_late() for q->dev, the driver of
> > > that device is defective, because it is responsible for preventing
> > > this situation from happening. The most straightforward way to
> > > achieve that is to provide a .suspend() callback for q->dev that will
> > > runtime-resume it (and, of course, q->dev will need to be prepared for
> > > system suspend as appropriate after that).
> >
> > Isn't the suspend / hibernation order such that no block I/O is
> > submitted while block devices transition to a lower power state? I'm
> > surprised to read that individual drivers are responsible for preventing
> > that blk_queue_enter() or __bio_queue_enter() run concurrently with
> > device_suspend_late().
>
> To be more precise, they don't need to be prevented from running
> concurrently with device_suspend_late() in general. The driver needs
> to ensure though that q->dev is not runtime-suspended in
> device_suspend_late() if blk_queue_enter() or __bio_queue_enter() are
> expected to run in parallel with it or later.
>
> > Regarding the UFSHCI driver: if a UFS controller is already runtime
> > suspended, we want it to remain suspended during system suspend.
>
> That can be done, but still the driver is responsible for preparing
> the device for system suspend.
>
> The most popular strategy is to use pm_runtime_force_suspend/resume()
> as driver suspend callbacks for the device, either as
> .suspend()/.resume() or as .suspend_late()/.resume_early(),
> respectively. In both cases, runtime PM will be disabled and runtime
> PM callbacks will be used for stopping the device - or not, if it is
> suspended already - but after that it must not be accessed in any way
> until the resume part runs.

One more thing that needs to be said here: The PM core expects the
decision on whether or not to leave a runtime-suspended device in
suspend across system-wide suspend-resume to be made before
device_suspend_late() is called for that device. If the device is
suspended at that point, the expectation is that it will be left in
suspend. Otherwise, the expectation is that it will be taken care of
by the .suspend_late() and .suspend_noirq() callbacks (and this goes
beyond runtime PM, quite obviously).
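
As a rough illustration only (this is not kernel code), the suspend-side
behavior of pm_runtime_force_suspend() described in the quoted text above
can be pictured with a small user-space model. struct model_dev and the
model_*() helpers are hypothetical names made up for this sketch; the
access rule in model_may_access() is a simplifying assumption, not a
statement about how the PM core is implemented.

```c
#include <stdbool.h>

/*
 * Toy user-space model. struct model_dev and the model_*() helpers are
 * hypothetical names for this sketch, not part of the kernel's runtime
 * PM API.
 */
struct model_dev {
	bool runtime_pm_enabled;	/* can runtime PM still change the state? */
	bool runtime_suspended;		/* is the device currently suspended? */
	int suspend_cb_calls;		/* times the stand-in callback has run */
};

/*
 * Models the suspend side as described in the thread: runtime PM is
 * disabled, and the runtime-suspend callback is used to stop the device
 * unless it is suspended already.
 */
static void model_force_suspend(struct model_dev *d)
{
	d->runtime_pm_enabled = false;
	if (d->runtime_suspended)
		return;			/* already suspended: leave it alone */
	d->suspend_cb_calls++;		/* stand-in for the driver's callback */
	d->runtime_suspended = true;
}

/*
 * Simplifying assumption of this model: a suspended device can only be
 * brought back while runtime PM is enabled, so once the suspend side has
 * run it must not be accessed until the resume side runs.
 */
static bool model_may_access(const struct model_dev *d)
{
	return d->runtime_pm_enabled || !d->runtime_suspended;
}
```

In this model, calling model_force_suspend() on an active device runs the
stand-in callback once, while calling it on a device that is already
suspended only disables runtime PM. In both cases model_may_access()
returns false afterwards.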