[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <68eb8e35-9619-ec78-3583-f4501ef200f8@acm.org>
Date: Thu, 13 Aug 2020 19:52:21 -0700
From: Bart Van Assche <bvanassche@....org>
To: Stanley Chu <stanley.chu@...iatek.com>
Cc: "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
"martin.petersen@...cle.com" <martin.petersen@...cle.com>,
"avri.altman@....com" <avri.altman@....com>,
"alim.akhtar@...sung.com" <alim.akhtar@...sung.com>,
"jejb@...ux.ibm.com" <jejb@...ux.ibm.com>,
"cang@...eaurora.org" <cang@...eaurora.org>,
"beanhuo@...ron.com" <beanhuo@...ron.com>,
"asutoshd@...eaurora.org" <asutoshd@...eaurora.org>,
"matthias.bgg@...il.com" <matthias.bgg@...il.com>,
"linux-mediatek@...ts.infradead.org"
<linux-mediatek@...ts.infradead.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Kuohong Wang (王國鴻)
<kuohong.wang@...iatek.com>,
Peter Wang (王信友)
<peter.wang@...iatek.com>,
Chun-Hung Wu (巫駿宏)
<Chun-hung.Wu@...iatek.com>,
Andy Teng (鄧如宏) <Andy.Teng@...iatek.com>,
Chaotian Jing (井朝天)
<Chaotian.Jing@...iatek.com>,
CC Chou (周志杰) <cc.chou@...iatek.com>,
Jiajie Hao (郝加节)
<jiajie.hao@...iatek.com>
Subject: Re: [PATCH v7] scsi: ufs: Quiesce all scsi devices before shutdown
On 2020-08-13 01:55, Stanley Chu wrote:
> I tried many ways to come out the final solution. Currently two options
> are considered,
>
> == Option 1 ==
> pm_runtime_get_sync(hba->dev);
>
> shost_for_each_device(sdev, hba->host) {
> scsi_autopm_get_device(sdev);
> if (sdev == hba->sdev_ufs_device)
> scsi_device_quiesce(sdev);
> else
> scsi_remove_device(sdev);
> }
>
> ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
>
> scsi_remove_device(hba->sdev_ufs_device);
>
> Note. Using scsi_autopm_get_device() instead of pm_runtime_disable()
> is to prevent noisy message by below checking,
>
> WARN_ON_ONCE(sdev->quiesced_by && sdev->quiesced_by != current);
>
> in
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/scsi/scsi_lib.c#n2515
>
> This warning shows up if we try to quiesce a runtime-suspended SCSI
> device. This is possible during our new shutdown flow. Using
> scsi_autopm_get_device() to resume all SCSI devices first can prevent
> it.
>
> In addition, normally sd_shutdown() would be executed prior than
> ufshcd_shutdown(). If scsi_remove_device() is invoked by
> ufshcd_shutdown(), sd_shutdown() will be executed again for a SCSI disk
> by
>
> [ 131.398977] sd_shutdown+0x44/0x118
> [ 131.399416] sd_remove+0x5c/0xc4
> [ 131.399824] device_release_driver_internal+0x1c4/0x2e4
> [ 131.400481] device_release_driver+0x18/0x24
> [ 131.401018] bus_remove_device+0x108/0x134
> [ 131.401533] device_del+0x2dc/0x630
> [ 131.401973] __scsi_remove_device+0xc0/0x174
> [ 131.402510] scsi_remove_device+0x30/0x48
> [ 131.403014] ufshcd_shutdown+0xc8/0x138
>
> In this case, we could see SYNCHRONIZE_CACHE command will be sent to the
> same SCSI device twice. This is kind of wired during shutdown flow.
>
> Moreover, in consideration of performance of ufshcd_shutdown(), Option 1
> obviously degrades the latency a lot by scsi_remove_device(). Please see
> the "Performance Measurement" data below.
>
> Compared Option 2, this way is simpler and also effective. This way may
> be a better compromise.
>
> == Option 2 ==
> pm_runtime_get_sync(hba->dev);
>
> shost_for_each_device(sdev, hba->host) {
> scsi_autopm_get_device(sdev);
> scsi_device_quiesce(sdev);
> }
>
> == Performance Measurement ==
> As-Is: < 5 ms
> Option 1: 850 ms
> Option 2: 60 ms
>
> What would you prefer? Or would you have any further suggestions?
Hi Stanley,
Thanks for the detailed report and also for having shared timing information.
The approach of option 2 seems wrong to me because the SCSI devices are not
removed. My concern is that option (2) could cause the sd driver to send SYNC
and/or STOP commands to the device after its PCIe resources have been freed,
resulting in a crash.
Please take a look at the output of the following command:
$ git grep -nHA10 'struct pci_driver.* = {$' */scsi |
sed -e 's/-/:/' -e 's/-/:/' |
grep ':[[:blank:]]*\.remove'
It seems to me that other SCSI LLDs do at least the following in their PCIe
removal callback:
1. Call scsi_remove_host()
2. Call scsi_host_put()
3. Call pci_disable_device()
Would that approach work for UFS? Would offlining the UFS LUNs (SDEV_OFFLINE)
before calling the above functions make SCSI host removal faster? See also
scsi_prep_state_check().
Thanks,
Bart.
Powered by blists - more mailing lists