[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5f129a61-0311-0b80-2e74-8425650bbd26@quicinc.com>
Date: Fri, 2 Dec 2022 17:52:41 +0530
From: Tushar Nimkar <quic_tnimkar@...cinc.com>
To: "Rafael J. Wysocki" <rjw@...ysocki.net>,
Adrian Hunter <adrian.hunter@...el.com>
CC: "Rafael J. Wysocki" <rafael@...nel.org>,
Nitin Rawat <quic_nitirawa@...cinc.com>,
<linux-pm@...r.kernel.org>, <linux-scsi@...r.kernel.org>,
<linux-arm-msm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
<bjorn.andersson@...nel.org>, <quic_mkshah@...cinc.com>,
<quic_lsrao@...cinc.com>, <bvanassche@....org>,
Peter Wang <peter.wang@...iatek.com>
Subject: Re: PM-runtime: supplier looses track of consumer during probe
Thanks Adrian and Rafael,
We are trying both patches separately. And will update result once we get.
Thanks,
Tushar Nimkar
On 12/2/2022 1:14 AM, Rafael J. Wysocki wrote:
> On Thursday, December 1, 2022 8:28:25 PM CET Rafael J. Wysocki wrote:
>> On Thu, Dec 1, 2022 at 2:10 PM Adrian Hunter <adrian.hunter@...el.com> wrote:
>>>
>>> On 29/11/22 18:56, Nitin Rawat wrote:
>>>> Hi Adrian,
>>>>
>>>> On 11/21/2022 11:38 AM, Tushar Nimkar wrote:
>>>>> Hi Adrian,
>>>>>
>>>>> On 11/18/2022 8:25 PM, Adrian Hunter wrote:
>>>>>> On 4/11/22 11:19, Tushar Nimkar wrote:
>>>>>>> Hi linux-pm/linux-scsi,
>>>>>
>>>>>>>> Process -1
>>>>>>>> ufshcd_async_scan context (process 1)
>>>>>>>> scsi_autopm_put_device() //0:0:0:0
>>>>>>
>>>>>> I am having trouble following your description. What function is calling
>>>>>> scsi_autopm_put_device() here?
>>>>>>
>>>>> Below is flow which calls scsi_autopm_put_device()
>>>>> Process -1
>>>>> ufshcd_async_scan()
>>>>> scsi_probe_and_add_lun()
>>>>> scsi_add_lun()
>>>>> slave_configure()
>>>>> scsi_sysfs_add_sdev()
>>>>> scsi_autopm_get_device()
>>>>> device_add() <- invoked [Process 2] sd_probe()
>>>>> scsi_autopm_put_device()
>>>>>
>>>>>>>> pm_runtime_put_sync()
>>>>>>>> __pm_runtime_idle()
>>>>>>>> rpm_idle() -- RPM_GET_PUT(4)
>>>>>>>> __rpm_callback
>>>>>>>> scsi_runtime_idle()
>>>>>>>> pm_runtime_mark_last_busy()
>>>>>>>> pm_runtime_autosuspend() --[A]
>>>>>>>> rpm_suspend() -- RPM_AUTO(8)
>>>>>>>> pm_runtime_autosuspend_expiration() use_autosuspend is false return 0 --- [B]
>>>>>>>> __update_runtime_status to RPM_SUSPENDING
>>>>>>>> __rpm_callback()
>>>>>>>> __rpm_put_suppliers(dev, false)
>>>>>>>> __update_runtime_status to RPM_SUSPENDED
>>>>>>>> rpm_suspend_suppliers()
>>>>>>>> rpm_idle() for supplier -- RPM_ASYNC(1) return (-EAGAIN) [ Other consumer active for supplier]
>>>>>>>> rpm_suspend() – END with return=0
>>>>>>>> scsi_runtime_idle() END return (-EBUSY) always.
>>>>>>
>>>>>> Not following here either. Which device is EBUSY and why?
>>>>>
>>>>> scsi_runtime_idle() return -EBUSY always [3]
>>>>> Storage/scsi team can better explain -EBUSY implementation.
>>>>
>>>> EBUSY is returned from below code for consumer dev 0:0:0:0.
>>>> scsi_runtime_idle is called from scsi_autopm_put_device which inturn is called from ufshcd_async_scan (Process 1 as per above call stack)
>>>> static int scsi_runtime_idle(struct device *dev)
>>>> {
>>>> :
>>>>
>>>> if (scsi_is_sdev_device(dev)) {
>>>> pm_runtime_mark_last_busy(dev);
>>>> pm_runtime_autosuspend(dev);
>>>> return -EBUSY; ---> EBUSY returned from here.
>>>> }
>>>>
>>>>
>>>> }
>>>>
>>>>>
>>>>> [3] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/scsi/scsi_pm.c?h=next-20221118#n210
>>>>>
>>>>>
>>>>>>>>
>>>>>>>> [1]: https://lore.kernel.org/lkml/4748074.GXAFRqVoOG@kreacher/T/
>>>>>>>> [2]: https://lkml.org/lkml/2022/10/12/259
>>>
>>> It looks to me like __rpm_callback() makes assumptions about
>>> dev->power.runtime_status that are not necessarily true because
>>> dev->power.lock is dropped.
>>
>> Well, this happens because rpm_idle() calls __rpm_callback() and
>> allows it to run concurrently with rpm_suspend() and rpm_resume(), so
>> one of them may change runtime_status to RPM_SUSPENDING or
>> RPM_RESUMING while __rpm_callback() is running.
>>
>> It is somewhat questionable whether or not this should be allowed to
>> happen, but since it is generally allowed to suspend the device from
>> its .runtime_idle callback, there is not too much that can be done
>> about it.
>
> But this means that the patch below should help too.
>
> I actually think that we can do both, because rpm_idle() doesn't have to do
> the whole device links dance and the fact that it still calls __rpm_callback()
> is a clear oversight.
>
> ---
> drivers/base/power/runtime.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -484,7 +484,17 @@ static int rpm_idle(struct device *dev,
>
> dev->power.idle_notification = true;
>
> - retval = __rpm_callback(callback, dev);
> + if (dev->power.irq_safe)
> + spin_unlock(&dev->power.lock);
> + else
> + spin_unlock_irq(&dev->power.lock);
> +
> + retval = callback(dev);
> +
> + if (dev->power.irq_safe)
> + spin_lock(&dev->power.lock);
> + else
> + spin_lock_irq(&dev->power.lock);
>
> dev->power.idle_notification = false;
> wake_up_all(&dev->power.wait_queue);
>
>
>
Powered by blists - more mailing lists