lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 2 Dec 2022 17:52:41 +0530
From:   Tushar Nimkar <quic_tnimkar@...cinc.com>
To:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Adrian Hunter <adrian.hunter@...el.com>
CC:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Nitin Rawat <quic_nitirawa@...cinc.com>,
        <linux-pm@...r.kernel.org>, <linux-scsi@...r.kernel.org>,
        <linux-arm-msm@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        <bjorn.andersson@...nel.org>, <quic_mkshah@...cinc.com>,
        <quic_lsrao@...cinc.com>, <bvanassche@....org>,
        Peter Wang <peter.wang@...iatek.com>
Subject: Re: PM-runtime: supplier looses track of consumer during probe

Thanks Adrian and Rafael,
We are trying both patches separately. And will update result once we get.

Thanks,
Tushar Nimkar

On 12/2/2022 1:14 AM, Rafael J. Wysocki wrote:
> On Thursday, December 1, 2022 8:28:25 PM CET Rafael J. Wysocki wrote:
>> On Thu, Dec 1, 2022 at 2:10 PM Adrian Hunter <adrian.hunter@...el.com> wrote:
>>>
>>> On 29/11/22 18:56, Nitin Rawat wrote:
>>>> Hi Adrian,
>>>>
>>>> On 11/21/2022 11:38 AM, Tushar Nimkar wrote:
>>>>> Hi Adrian,
>>>>>
>>>>> On 11/18/2022 8:25 PM, Adrian Hunter wrote:
>>>>>> On 4/11/22 11:19, Tushar Nimkar wrote:
>>>>>>> Hi linux-pm/linux-scsi,
>>>>>
>>>>>>>> Process -1
>>>>>>>> ufshcd_async_scan context (process 1)
>>>>>>>> scsi_autopm_put_device() //0:0:0:0
>>>>>>
>>>>>> I am having trouble following your description.  What function is calling
>>>>>> scsi_autopm_put_device() here?
>>>>>>
>>>>> Below is flow which calls scsi_autopm_put_device()
>>>>> Process -1
>>>>> ufshcd_async_scan()
>>>>>       scsi_probe_and_add_lun()
>>>>>           scsi_add_lun()
>>>>>               slave_configure()
>>>>>                   scsi_sysfs_add_sdev()
>>>>>                       scsi_autopm_get_device()
>>>>>                           device_add()     <- invoked [Process 2] sd_probe()
>>>>>                               scsi_autopm_put_device()
>>>>>
>>>>>>>> pm_runtime_put_sync()
>>>>>>>> __pm_runtime_idle()
>>>>>>>> rpm_idle() -- RPM_GET_PUT(4)
>>>>>>>>        __rpm_callback
>>>>>>>>            scsi_runtime_idle()
>>>>>>>>                pm_runtime_mark_last_busy()
>>>>>>>>                pm_runtime_autosuspend()  --[A]
>>>>>>>>                    rpm_suspend() -- RPM_AUTO(8)
>>>>>>>>                        pm_runtime_autosuspend_expiration() use_autosuspend    is false return 0   --- [B]
>>>>>>>>                            __update_runtime_status to RPM_SUSPENDING
>>>>>>>>                        __rpm_callback()
>>>>>>>>                            __rpm_put_suppliers(dev, false)
>>>>>>>>                        __update_runtime_status to RPM_SUSPENDED
>>>>>>>>                    rpm_suspend_suppliers()
>>>>>>>>                        rpm_idle() for supplier -- RPM_ASYNC(1) return (-EAGAIN) [ Other consumer active for supplier]
>>>>>>>>                    rpm_suspend() – END with return=0
>>>>>>>>            scsi_runtime_idle() END return (-EBUSY) always.
>>>>>>
>>>>>> Not following here either.  Which device is EBUSY and why?
>>>>>
>>>>> scsi_runtime_idle() return -EBUSY always [3]
>>>>> Storage/scsi team can better explain -EBUSY implementation.
>>>>
>>>> EBUSY is returned from below code for consumer dev 0:0:0:0.
>>>> scsi_runtime_idle is called from scsi_autopm_put_device which inturn is called from ufshcd_async_scan (Process 1 as per above call stack)
>>>> static int scsi_runtime_idle(struct device *dev)
>>>> {
>>>>      :
>>>>
>>>>      if (scsi_is_sdev_device(dev)) {
>>>>          pm_runtime_mark_last_busy(dev);
>>>>          pm_runtime_autosuspend(dev);
>>>>          return -EBUSY; ---> EBUSY returned from here.
>>>>      }
>>>>
>>>>
>>>> }
>>>>
>>>>>
>>>>> [3] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/scsi/scsi_pm.c?h=next-20221118#n210
>>>>>
>>>>>
>>>>>>>>
>>>>>>>> [1]: https://lore.kernel.org/lkml/4748074.GXAFRqVoOG@kreacher/T/
>>>>>>>> [2]: https://lkml.org/lkml/2022/10/12/259
>>>
>>> It looks to me like __rpm_callback() makes assumptions about
>>> dev->power.runtime_status that are not necessarily true because
>>> dev->power.lock is dropped.
>>
>> Well, this happens because rpm_idle() calls __rpm_callback() and
>> allows it to run concurrently with rpm_suspend() and rpm_resume(), so
>> one of them may change runtime_status to RPM_SUSPENDING or
>> RPM_RESUMING while __rpm_callback() is running.
>>
>> It is somewhat questionable whether or not this should be allowed to
>> happen, but since it is generally allowed to suspend the device from
>> its .runtime_idle callback, there is not too much that can be done
>> about it.
> 
> But this means that the patch below should help too.
> 
> I actually think that we can do both, because rpm_idle() doesn't have to do
> the whole device links dance and the fact that it still calls __rpm_callback()
> is a clear oversight.
> 
> ---
>   drivers/base/power/runtime.c |   12 +++++++++++-
>   1 file changed, 11 insertions(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -484,7 +484,17 @@ static int rpm_idle(struct device *dev,
>   
>   	dev->power.idle_notification = true;
>   
> -	retval = __rpm_callback(callback, dev);
> +	if (dev->power.irq_safe)
> +		spin_unlock(&dev->power.lock);
> +	else
> +		spin_unlock_irq(&dev->power.lock);
> +
> +	retval = callback(dev);
> +
> +	if (dev->power.irq_safe)
> +		spin_lock(&dev->power.lock);
> +	else
> +		spin_lock_irq(&dev->power.lock);
>   
>   	dev->power.idle_notification = false;
>   	wake_up_all(&dev->power.wait_queue);
> 
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ