Message-ID: <15c6d1cf-0aaa-0a01-5bf4-0762f45d7676@huawei.com>
Date: Tue, 5 Mar 2024 19:25:33 +0800
From: yangxingui <yangxingui@...wei.com>
To: John Garry <john.g.garry@...cle.com>, Jason Yan <yanaijie@...wei.com>,
	<jejb@...ux.ibm.com>, <martin.petersen@...cle.com>,
	<damien.lemoal@...nsource.wdc.com>
CC: <linux-scsi@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<linuxarm@...wei.com>, <prime.zeng@...ilicon.com>,
	<chenxiang66@...ilicon.com>, <kangfenglong@...wei.com>
Subject: Re: [PATCH] scsi: libsas: Fix disk not being scanned in after being
 removed


Hi John,
On 2024/3/5 18:15, John Garry wrote:
> On 05/03/2024 02:56, Jason Yan wrote:
>> On 2024/3/4 20:50, yangxingui wrote:
>>> Hi Jason,
>>>
>>> On 2024/3/1 9:55, Jason Yan wrote:
>>>> On 2024/2/29 2:13, John Garry wrote:
>>>>> On 21/02/2024 07:31, Xingui Yang wrote:
>>>>>> As of commit d8649fc1c5e4 ("scsi: libsas: Do discovery on empty 
>>>>>> PHY to
>>>>>> update PHY info"), do discovery will send a new SMP_DISCOVER and 
>>>>>> update
>>>>>> phy->phy_change_count. We found that if the disk is reconnected 
>>>>>> and phy
>>>>>> change_count changes at this time, the disk scanning process will 
>>>>>> not be
>>>>>> triggered.
>>>>>>
>>>>>> So update the PHY info with the last query results.
>>>>>>
>>>>>> Fixes: d8649fc1c5e4 ("scsi: libsas: Do discovery on empty PHY to 
>>>>>> update PHY info")
>>>>>> Signed-off-by: Xingui Yang <yangxingui@...wei.com>
>>>>>> ---
>>>>>>   drivers/scsi/libsas/sas_expander.c | 9 ++++-----
>>>>>>   1 file changed, 4 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/scsi/libsas/sas_expander.c 
>>>>>> b/drivers/scsi/libsas/sas_expander.c
>>>>>> index a2204674b680..9563f5589948 100644
>>>>>> --- a/drivers/scsi/libsas/sas_expander.c
>>>>>> +++ b/drivers/scsi/libsas/sas_expander.c
>>>>>> @@ -1681,6 +1681,10 @@ int sas_get_phy_attached_dev(struct 
>>>>>> domain_device *dev, int phy_id,
>>>>>>           if (*type == 0)
>>>>>>               memset(sas_addr, 0, SAS_ADDR_SIZE);
>>>>>>       }
>>>>>> +
>>>>>> +    if ((SAS_ADDR(sas_addr) == 0) || (res == -ECOMM))
>>>>>
>>>>> It's odd to call sas_set_ex_phy() if we got res == -ECOMM. I mean, 
>>>>> in this case disc_resp is not filled in as the command did not 
>>>>> execute, right? I know that is what the current code does, but it 
>>>>> is strange.
>>>>
>>>> The current code actually re-sends the SMP command and updates the PHY 
>>>> status only when the SMP command is responded to correctly.
>>>>
>>>> Xinggui, can you please fix this and send v3?
>>> The current location cannot directly update the phy information. The 
>>> previous phy information will be used later, and the previous sas 
>>> address will be compared with the currently queried sas address. At 
>>> present, v2 is more suitable after many days of testing.
> 
> I don't understand this. Where is the previous SAS address compared to 
> the current SAS address?
> 
> Could this work:
> 
> diff --git a/drivers/scsi/libsas/sas_expander.c 
> b/drivers/scsi/libsas/sas_expander.c
> index a2204674b680..e190038ba7bd 100644
> --- a/drivers/scsi/libsas/sas_expander.c
> +++ b/drivers/scsi/libsas/sas_expander.c
> @@ -1675,11 +1675,13 @@ int sas_get_phy_attached_dev(struct 
> domain_device *dev, int phy_id,
> 
>          res = sas_get_phy_discover(dev, phy_id, disc_resp);
>          if (res == 0) {
> -               memcpy(sas_addr, disc_resp->disc.attached_sas_addr,
> -                      SAS_ADDR_SIZE);
>                  *type = to_dev_type(&disc_resp->disc);
> -               if (*type == 0)
> +               if (*type == SAS_PHY_UNUSED)
>                          memset(sas_addr, 0, SAS_ADDR_SIZE);
> +               else
> +                       memcpy(sas_addr, disc_resp->disc.attached_sas_addr,
> +                      SAS_ADDR_SIZE);
> +               sas_set_ex_phy(dev, phy_id, disc_resp);
>          }
>          kfree(disc_resp);
>          return res;
> 
> It's like the change in this patch.
This doesn't work properly. After sas_rediscover_dev() calls
sas_get_phy_attached_dev(), the previous SAS address is compared with
the currently queried SAS address, and the previous PHY information is
also used when calling sas_unregister_devs_sas_addr(). Therefore, it is
more appropriate to update the PHY information after the device is
unregistered, as follows:
static int sas_rediscover_dev(struct domain_device *dev, int phy_id,
                               bool last, int sibling)
{
	...
        res = sas_get_phy_attached_dev(dev, phy_id, sas_addr, &type);
         switch (res) {
         case SMP_RESP_NO_PHY:
                 phy->phy_state = PHY_NOT_PRESENT;
                 sas_unregister_devs_sas_addr(dev, phy_id, last);
                 return res;
         case SMP_RESP_PHY_VACANT:
                 phy->phy_state = PHY_VACANT;
                 sas_unregister_devs_sas_addr(dev, phy_id, last);
                 return res;
         case SMP_RESP_FUNC_ACC:
                 break;
         case -ECOMM:
                 break;
         default:
                 return res;
         }

         if ((SAS_ADDR(sas_addr) == 0) || (res == -ECOMM)) {
                 phy->phy_state = PHY_EMPTY;
                 sas_unregister_devs_sas_addr(dev, phy_id, last);
                 /*
                  * Even though the PHY is empty, for convenience we discover
                  * the PHY to update the PHY info, like negotiated linkrate.
                  */
                 sas_ex_phy_discover(dev, phy_id);
                 return res;
         } else if (SAS_ADDR(sas_addr) == SAS_ADDR(phy->attached_sas_addr) &&
                    /* ^=== Compare the previous SAS address with the current SAS address */
                    dev_type_flutter(type, phy->attached_dev_type)) {
                 struct domain_device *ata_dev = sas_ex_to_ata(dev, phy_id);
                 char *action = "";

                 sas_ex_phy_discover(dev, phy_id);

                 if (ata_dev && phy->attached_dev_type == SAS_SATA_PENDING)
                         action = ", needs recovery";
                 pr_debug("ex %016llx phy%02d broadcast flutter%s\n",
                          SAS_ADDR(dev->sas_addr), phy_id, action);
                 return res;
         }
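
The ordering concern can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not libsas code: struct toy_phy, enum toy_result and both toy_rediscover_*() functions are hypothetical names invented for the illustration, and the cached address stands in for phy->attached_sas_addr. It shows that if the cached address is overwritten before the comparison, a replaced disk looks like a flutter of the old one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the cached expander PHY state (not struct ex_phy). */
struct toy_phy {
	uint64_t attached_sas_addr;	/* address recorded by the previous discovery */
};

enum toy_result {
	TOY_FLUTTER,	/* same device still attached, nothing to rescan */
	TOY_NEW_DEVICE,	/* address changed: unregister the old device, scan the new one */
};

/* Early update: the cache is overwritten before the comparison runs. */
static enum toy_result toy_rediscover_update_first(struct toy_phy *phy,
						   uint64_t queried_addr)
{
	phy->attached_sas_addr = queried_addr;	/* previous address is lost here */
	if (queried_addr == phy->attached_sas_addr)	/* now always true */
		return TOY_FLUTTER;
	return TOY_NEW_DEVICE;
}

/* Late update: compare against the previous address, then refresh the cache. */
static enum toy_result toy_rediscover_update_last(struct toy_phy *phy,
						  uint64_t queried_addr)
{
	bool same = (queried_addr == phy->attached_sas_addr);

	phy->attached_sas_addr = queried_addr;	/* safe: comparison already done */
	return same ? TOY_FLUTTER : TOY_NEW_DEVICE;
}
```

With a cached address A and a queried address B, the first variant reports TOY_FLUTTER while the second reports TOY_NEW_DEVICE, which is the difference described above. The real code also needs the previous address to find the device to unregister, which this model leaves out.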

> 
> 
>>
>> OK, so let me have a closer look at v2.
> 
> I have to say that v2 is quite complicated...
Yes, but it works.

Thanks,
Xingui
