Message-ID: <50818674-e5b7-415f-a023-40611ea10850@quicinc.com>
Date:   Fri, 1 Dec 2023 20:13:29 +0800
From:   Can Guo <quic_cang@...cinc.com>
To:     Manivannan Sadhasivam <mani@...nel.org>,
        Ziqi Chen <quic_ziqichen@...cinc.com>
CC:     <quic_asutoshd@...cinc.com>, <bvanassche@....org>,
        <beanhuo@...ron.com>, <avri.altman@....com>,
        <junwoo80.lee@...sung.com>, <martin.petersen@...cle.com>,
        <quic_nguyenb@...cinc.com>, <quic_nitirawa@...cinc.com>,
        <quic_rampraka@...cinc.com>, <linux-scsi@...r.kernel.org>,
        Andy Gross <agross@...nel.org>,
        Bjorn Andersson <andersson@...nel.org>,
        Konrad Dybcio <konrad.dybcio@...aro.org>,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        "open list:ARM/QUALCOMM SUPPORT" <linux-arm-msm@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] scsi: ufs: qcom: move ufs_qcom_host_reset() to
 ufs_qcom_device_reset()

Hi Mani,

On 12/1/2023 1:18 PM, Manivannan Sadhasivam wrote:
> On Wed, Nov 29, 2023 at 08:10:57PM +0800, Ziqi Chen wrote:
>>
>>
>> On 11/28/2023 7:27 PM, Manivannan Sadhasivam wrote:
>>> On Tue, Nov 28, 2023 at 03:40:57AM +0800, Ziqi Chen wrote:
>>>>
>>>>
>>>> On 11/22/2023 2:14 PM, Can Guo wrote:
>>>>>
>>>>>
>>>>> On 10/25/2023 3:41 PM, Manivannan Sadhasivam wrote:
>>>>>> On Tue, Oct 24, 2023 at 07:10:15PM +0800, Ziqi Chen wrote:
>>>>>>> During PISI test, we found the issue that host Tx still bursting after
>>>>>>
>>>>>> What is PISI test?
>>>>
>>>> SI measurement.
>>>>
>>>
>>> Please expand it in the patch description.
>>
>> Sure, I will update in next patch version.
>>
>>>
>>>>>>
>>>>>>> H/W reset. Move ufs_qcom_host_reset() to ufs_qcom_device_reset() and
>>>>>>> reset host before device reset to stop tx burst.
>>>>>>>
>>>>>>
>>>>>> device_reset() callback is supposed to reset only the device and not
>>>>>> the host.
>>>>>> So NACK for this patch.
>>>>>
>>>>> Agree, the change should come in a more reasonable way.
>>>>>
>>>>> Actually, similar code is already there in ufs_mtk_device_reset() in
>>>>> ufs-mediatek.c, I guess here is trying to mimic that fashion.
>>>>>
>>>>> This change, from its functionality point of view, we do need it,
>>>>> because I occasionally (2 out of 10) hit PHY error on lane 0 during
>>>>> reboot test (in my case, I tried SM8350, SM8450 and SM8550, all same).
>>>>>
>>>>> [    1.911188] [DEBUG]ufshcd_update_uic_error: UECPA:0x80000002
>>>>> [    1.922843] [DEBUG]ufshcd_update_uic_error: UECDL:0x80004000
>>>>> [    1.934473] [DEBUG]ufshcd_update_uic_error: UECN:0x0
>>>>> [    1.944688] [DEBUG]ufshcd_update_uic_error: UECT:0x0
>>>>> [    1.954901] [DEBUG]ufshcd_update_uic_error: UECDME:0x0
>>>>>
>>>>> I found out that the PHY error pops out right after UFS device gets
>>>>> reset in the 2nd init. After having this change in place, the PA/DL
>>>>> errors are gone.
>>>>
>>>> Hi Mani,
>>>>
>>>> There is another way that adding a new vops that call XXX_host_reset() from
>>>> soc vendor driver. in this way, we can call this vops in core layer without
>>>> the dependency of device reset.
>>>> due to we already observed such error and received many same reports from
>>>> different OEMs, we need to fix it in some way.
>>>> if you think above way is available, I will update new patch in soon. Or
>>>> could you give us other suggestion?
>>>>
>>>
>>> First, please describe the issue in detail. How the issue is getting triggered
>>> and then justify your change. I do not have access to the bug reports that you
>>> received.
>>
>>  From the waveform measured by Samsung , we can see at the end of 2nd Link
>> Startup, host still keep bursting after H/W reset. This abnormal timing
>> would cause the PA/DL error mentioned by Can.
>>
>> On the other hand, at the end of 1st Link start up, Host ends bursting at
>> first and then sends H/W reset to device. So Samsung suggested to do host
>> reset before every time device reset to fix this issue. That's what you saw
>> in this patch.  This patch has been verified by OEMs.
>>
> 
> Thanks for the detail. This info should have been part of the patch description.
> 
>> So do you think if we can keep this change with details update in commit
>> message. or need to do other improvement?
>>
> 
> For sure we should not do host reset within device_reset callback. I'd like to
> know at what point of time we are seeing the host burst after device reset. I
> mean can you point me to the code in the ufshcd driver that when calling
> device_reset you are seeing the issue? Then we can do a host_reset before that
> _specific_ device_reset with the help of the new vops you suggested.

Actually, any time we are about to reset the device, we need to reset
the host first because, as Ziqi mentioned, if the host is still bursting
after the device is reset, it may lead to PA/DL errors. This might be a
bit confusing, because the host can be bursting flow control frames
and/or dummy frames even when SW thinks it is idle.

The PHY error is not easy to observe because it is non-fatal: it does
not trigger error handling, and nothing is logged or printed on the
serial console, so it is silent. However, we have the error history, in
which PHY errors are recorded. Although the PHY error is non-fatal, we
don't want to see any of them, because our PHY team and customers have
zero tolerance for PHY errors.
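
To illustrate what I mean by a silent error that still shows up in the
error history, here is a minimal user-space sketch. It is not the ufshcd
code: struct err_hist, err_hist_record() and ERR_HIST_LEN are names made
up for this example. It only assumes that each non-fatal error register
value is stored in a small ring buffer and counted, without printing
anything.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical history depth, chosen only for this sketch. */
#define ERR_HIST_LEN 8

/* Hypothetical error history: last ERR_HIST_LEN values plus a total count. */
struct err_hist {
	uint32_t val[ERR_HIST_LEN];
	unsigned int pos;	/* next slot to overwrite */
	unsigned int cnt;	/* total number of errors recorded */
};

/* Record a non-fatal error silently: no log, no error handling. */
void err_hist_record(struct err_hist *h, uint32_t reg_val)
{
	assert(h != 0);
	h->val[h->pos] = reg_val;
	h->pos = (h->pos + 1) % ERR_HIST_LEN;
	h->cnt++;
}
```

With something like this, a reboot test can check the count afterwards
instead of relying on console output.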

Currently, there are 3 scenarios where the host reset should go before
the device reset:

1. When Linux boots up, we reset the device in ufshcd_hba_init(). In
this case, we need to reset the host before resetting the device,
because the previous boot stage usually leaves both the device and the
host active before jumping to Linux. This is the case this change was
originally made for.

2. When the 2nd init kicks off in ufshcd_probe_hba(), we reset the
device. In this case, we need to reset the host before resetting the
device. This is the case I mentioned in my previous reply.

3. In the UFS error handler, we reset the device. In this case too, we
need to reset the host before resetting the device.
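
As a rough sketch of the ordering that is common to all three cases,
something like the below could work. This is a user-space mock, not a
patch: struct hba, struct variant_ops, the host_reset op and
reset_host_then_device() are hypothetical stand-ins for the new vops
idea Ziqi mentioned, and the mock ops only record the call order. A
failed host reset is only warned about, same as the dev_warn() in the
original patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct hba;

/* Hypothetical variant ops: host_reset is the proposed new, optional op. */
struct variant_ops {
	int (*host_reset)(struct hba *hba);
	int (*device_reset)(struct hba *hba);
};

/* Hypothetical stand-in for the host bus adapter state. */
struct hba {
	const struct variant_ops *vops;
	char trace[8];		/* call order: 'H' = host, 'D' = device */
	size_t ntrace;
};

static void trace_call(struct hba *hba, char c)
{
	if (hba->ntrace < sizeof(hba->trace) - 1) {
		hba->trace[hba->ntrace++] = c;
		hba->trace[hba->ntrace] = '\0';
	}
}

/* Mock ops: a real driver would touch hardware here. */
static int mock_host_reset(struct hba *hba)
{
	trace_call(hba, 'H');
	return 0;
}

static int mock_device_reset(struct hba *hba)
{
	trace_call(hba, 'D');
	return 0;
}

const struct variant_ops mock_vops = {
	.host_reset = mock_host_reset,
	.device_reset = mock_device_reset,
};

/* Stop the host from bursting first, then reset the device. */
int reset_host_then_device(struct hba *hba)
{
	int ret;

	assert(hba && hba->vops && hba->vops->device_reset);

	if (hba->vops->host_reset) {
		ret = hba->vops->host_reset(hba);
		if (ret)
			fprintf(stderr, "host reset returned %d\n", ret);
	}

	return hba->vops->device_reset(hba);
}
```

The point is only that every path that resets the device goes through
one place where the host reset happens first, and that device_reset
itself stays a device-only callback.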

Thanks,
Can Guo.

> 
> - Mani
> 
>>
>> -Ziqi
>>
>>>
>>> - Mani
>>>
>>>> -Ziqi
>>>>
>>>>>
>>>>> Thanks,
>>>>> Can Guo.
>>>>>>
>>>>>> - Mani
>>>>>>
>>>>>>> Signed-off-by: Ziqi Chen <quic_ziqichen@...cinc.com>
>>>>>>> ---
>>>>>>>     drivers/ufs/host/ufs-qcom.c | 13 +++++++------
>>>>>>>     1 file changed, 7 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/ufs/host/ufs-qcom.c b/drivers/ufs/host/ufs-qcom.c
>>>>>>> index 96cb8b5..43163d3 100644
>>>>>>> --- a/drivers/ufs/host/ufs-qcom.c
>>>>>>> +++ b/drivers/ufs/host/ufs-qcom.c
>>>>>>> @@ -445,12 +445,6 @@ static int
>>>>>>> ufs_qcom_power_up_sequence(struct ufs_hba *hba)
>>>>>>>         struct phy *phy = host->generic_phy;
>>>>>>>         int ret;
>>>>>>> -    /* Reset UFS Host Controller and PHY */
>>>>>>> -    ret = ufs_qcom_host_reset(hba);
>>>>>>> -    if (ret)
>>>>>>> -        dev_warn(hba->dev, "%s: host reset returned %d\n",
>>>>>>> -                  __func__, ret);
>>>>>>> -
>>>>>>>         /* phy initialization - calibrate the phy */
>>>>>>>         ret = phy_init(phy);
>>>>>>>         if (ret) {
>>>>>>> @@ -1709,6 +1703,13 @@ static void ufs_qcom_dump_dbg_regs(struct
>>>>>>> ufs_hba *hba)
>>>>>>>     static int ufs_qcom_device_reset(struct ufs_hba *hba)
>>>>>>>     {
>>>>>>>         struct ufs_qcom_host *host = ufshcd_get_variant(hba);
>>>>>>> +    int ret = 0;
>>>>>>> +
>>>>>>> +    /* Reset UFS Host Controller and PHY */
>>>>>>> +    ret = ufs_qcom_host_reset(hba);
>>>>>>> +    if (ret)
>>>>>>> +        dev_warn(hba->dev, "%s: host reset returned %d\n",
>>>>>>> +                  __func__, ret);
>>>>>>>         /* reset gpio is optional */
>>>>>>>         if (!host->device_reset)
>>>>>>> -- 
>>>>>>> 2.7.4
>>>>>>>
>>>>>>
>>>
> 
