Message-ID: <d7111bd2-d431-e5e1-1a36-6d0d4d4ec19b@quicinc.com>
Date: Wed, 12 Oct 2022 16:01:01 +0530
From: Nitin Rawat <quic_nitirawa@...cinc.com>
To: Peter Wang <peter.wang@...iatek.com>,
    "Rafael J. Wysocki" <rafael@...nel.org>
CC: "Rafael J. Wysocki" <rjw@...ysocki.net>,
    Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
    Linux PM <linux-pm@...r.kernel.org>,
    LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1] PM-runtime: Check supplier_preactivated before release supplier

Hi Peter/Rafael,

We are also observing a similar issue on our platform. There appears to
be a race condition (explained below) that causes the consumer to resume
without bumping up the supplier's PM-runtime usage counter.

Process 1 (ufshcd_async_scan context)
ufshcd_async_scan()
  scsi_probe_and_add_lun
    scsi_add_lun
      slave_configure -> enable rpm
      scsi_sysfs_add_sdev
        scsi_autopm_get_device
        device_add <- invoked sd_probe in process 2
        scsi_autopm_put_device

Process 2 (sd_probe context)
driver_probe_device
__device_attach_async_helper
  __device_attach_driver
    driver_probe_device
      __driver_probe_device
        sd_probe
          scsi_autopm_get_device

A race on dev->power.runtime_status for consumer device 0:0:0:0 can
happen in the runtime PM framework as follows:

ufshcd_async_scan context (process 1)
scsi_autopm_put_device() //0:0:0:0
  pm_runtime_put_sync()
    __pm_runtime_idle()
      rpm_idle()
        __rpm_callback()
          scsi_runtime_idle()
            pm_runtime_mark_last_busy()
            pm_runtime_autosuspend()
              __pm_runtime_suspend(RPM_AUTO)
                rpm_suspend(RPM_AUTO)
                  status = RPM_SUSPENDING
                  scsi_runtime_suspend()
                  __rpm_callback()
                  status = RPM_SUSPENDED ------> 1
                  rpm_suspend_suppliers()
            return -EBUSY
          (use_links) && (dev->power.runtime_status == RPM_RESUMING &&
              retval) ------> 3
            __rpm_put_suppliers()

sd_probe context (process 2)
scsi_autopm_get_device() //0:0:0:0
  __pm_runtime_resume(RPM_GET_PUT)
    rpm_resume
      status = RPM_RESUMING ------> 2

After power.runtime_status of consumer 0:0:0:0 was changed to
RPM_SUSPENDED (point 1), but before scsi_runtime_idle() returned
-16 (-EBUSY) to __rpm_callback(), the sd_probe context changed
power.runtime_status of consumer 0:0:0:0 to RPM_RESUMING (point 2).
Condition 3 therefore evaluated to true and __rpm_put_suppliers() was
called, so the consumer resumed with the supplier's usage_count
decremented because of this race.
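
For illustration, the interleaving above can be replayed in a small
user-space C model. This is only a sketch, not kernel code: struct
model_dev, put_suppliers_needed() and replay_race() are hypothetical
names, the starting supplier count is whatever the caller sets, and only
the part of the condition quoted at point 3 is modelled.

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-ins for the runtime PM status values named above. */
enum rpm_status { RPM_ACTIVE, RPM_RESUMING, RPM_SUSPENDED, RPM_SUSPENDING };

/* Hypothetical model of the consumer (0:0:0:0) and its supplier count. */
struct model_dev {
	enum rpm_status runtime_status; /* models dev->power.runtime_status */
	int supplier_usage;             /* models the supplier's usage counter */
};

/* Only the branch quoted at point 3: status is RPM_RESUMING and retval != 0. */
static bool put_suppliers_needed(bool use_links, enum rpm_status st, int retval)
{
	return use_links && (st == RPM_RESUMING && retval);
}

/* Replays points 1, 2 and 3 in the order described in this mail. */
static int replay_race(struct model_dev *dev)
{
	int retval;

	/* Process 1: rpm_suspend() completes (point 1). */
	dev->runtime_status = RPM_SUSPENDED;

	/* Process 2: rpm_resume() starts from sd_probe (point 2). */
	dev->runtime_status = RPM_RESUMING;

	/* Process 1: scsi_runtime_idle() returns -EBUSY to __rpm_callback(). */
	retval = -EBUSY;

	/* Process 1: the check at point 3 now sees RPM_RESUMING and an error. */
	if (put_suppliers_needed(true, dev->runtime_status, retval))
		dev->supplier_usage--; /* models __rpm_put_suppliers() */

	return retval;
}
```

Starting from supplier_usage = 1, replay_race() leaves the counter at 0
while the status still reads RPM_RESUMING, which matches the consumer
resuming without a reference held on its supplier.
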
Please let me know your thoughts on this.
Regards,
Nitin
On 8/2/2022 7:03 PM, Peter Wang wrote:
>
> On 8/2/22 7:01 PM, Rafael J. Wysocki wrote:
>> On Tue, Aug 2, 2022 at 5:19 AM Peter Wang <peter.wang@...iatek.com>
>> wrote:
>>>
>>>> Hi Rafael,
>>>>
>>>> Yes, it is very clear!
>>>> I miss this important key point that usage_count is always >
>>>> rpm_active 1.
>>>> I think this patch could work.
>>>>
>>>> Thanks.
>>>> Peter
>>>>
>>> Hi Rafael,
>>>
>>> After testing with commit 887371066039011144b4a94af97d9328df6869a2
>>> ("PM: runtime: Fix supplier device management during consumer probe")
>>> over the past weeks, the supplier still suspends while the consumer is
>>> active "after" pm_runtime_put_suppliers.
>>> Do you have any idea about that?
>> Well, this means that the consumer probe doesn't bump up the
>> supplier's PM-runtime usage counter as appropriate.
>>
>> You need to tell me more about what happens during the consumer probe.
>> Which driver is this?
>
> Hi Rafael,
>
> I had the same idea, but I still don't know how it could happen.
>
> It is the upstream UFS driver in the SCSI subsystem. Here is the call flow:
> do_scan_async (process 1)
>   do_scsi_scan_host
>     scsi_scan_host_selected
>       scsi_scan_channel
>         __scsi_scan_target
>           scsi_probe_and_add_lun
>             scsi_alloc_sdev
>               slave_alloc -> setup link
>             scsi_add_lun
>               slave_configure -> enable rpm
>               scsi_sysfs_add_sdev
>                 scsi_autopm_get_device <- get runtime pm
>                 device_add <- invoke sd_probe in process 2
>                 scsi_autopm_put_device <- put runtime pm, point 1
>
> driver_probe_device (process 2)
>   __driver_probe_device
>     pm_runtime_get_suppliers
>     really_probe
>       sd_probe
>         scsi_autopm_get_device <- get runtime pm, point 2
>         pm_runtime_set_autosuspend_delay <- set rpm delay to 2s
>         scsi_autopm_put_device <- put runtime pm
>     pm_runtime_put_suppliers <- (link->rpm_active = 1)
>
> After process 1 calls scsi_autopm_put_device (point 1) and lets the
> consumer enter suspend, the scsi_autopm_get_device call in process 2
> (point 2) may resume the consumer without bumping up the supplier's
> PM-runtime usage counter as appropriate.
>
> Thanks.
> Peter
>
>