Message-ID: <815fcddf-85cc-126e-4be1-618b5ba8f823@huawei.com>
Date: Tue, 18 Jun 2024 19:45:59 +0800
From: yangxingui <yangxingui@...wei.com>
To: John Garry <john.g.garry@...cle.com>, <yanaijie@...wei.com>,
<jejb@...ux.ibm.com>, <martin.petersen@...cle.com>,
<damien.lemoal@...nsource.wdc.com>
CC: <linux-scsi@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linuxarm@...wei.com>, <prime.zeng@...ilicon.com>,
<chenxiang66@...ilicon.com>, <kangfenglong@...wei.com>
Subject: Re: [PATCH v3] scsi: libsas: Fix exp-attached end device cannot be
scanned in again after probe failed
Hi, John,
Thanks for your reply.
On 2024/6/18 16:55, John Garry wrote:
> On 13/06/2024 13:23, Xingui Yang wrote:
>
> Sorry for delay in responding and asking further questions.
No problem.
>
>> We found that it is judged as broadcast flutter when the exp-attached end
>> device reconnects after probe failed, as follows:
>>
>> [78779.654026] sas: broadcast received: 0
>> [78779.654037] sas: REVALIDATING DOMAIN on port 0, pid:10
>> [78779.654680] sas: ex 500e004aaaaaaa1f phy05 change count has changed
>> [78779.662977] sas: ex 500e004aaaaaaa1f phy05 originated
>> BROADCAST(CHANGE)
>> [78779.662986] sas: ex 500e004aaaaaaa1f phy05 new device attached
>> [78779.663079] sas: ex 500e004aaaaaaa1f phy05:U:8 attached:
>> 500e004aaaaaaa05 (stp)
>> [78779.693542] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] found
>> [78779.701155] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0
>> [78779.707864] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
>> ...
>> [78835.161307] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0
>> tries: 1
>> [78835.171344] sas: sas_probe_sata: for exp-attached device
>> 500e004aaaaaaa05 returned -19
>> [78835.180879] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] is gone
>> [78835.187487] sas: broadcast received: 0
>> [78835.187504] sas: REVALIDATING DOMAIN on port 0, pid:10
>> [78835.188263] sas: ex 500e004aaaaaaa1f phy05 change count has changed
>> [78835.195870] sas: ex 500e004aaaaaaa1f phy05 originated
>> BROADCAST(CHANGE)
>> [78835.195875] sas: ex 500e004aaaaaaa1f rediscovering phy05
>> [78835.196022] sas: ex 500e004aaaaaaa1f phy05:U:A attached:
>> 500e004aaaaaaa05 (stp)
>> [78835.196026] sas: ex 500e004aaaaaaa1f phy05 broadcast flutter
>> [78835.197615] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0
>>
>> The cause of the problem is that the related ex_phy's
>> attached_sas_addr was
>> not cleared after the end device probe failed. In order to solve the
>> above
>> problem, a function sas_ex_unregister_end_dev() is defined to clear the
>> ex_phy information and unregister the end device after the
>> exp-attached end
>> device probe failed.
>
> Can you just manually clear the ex_phy's attached_sas_addr at the
> appropriate point (along with calling sas_unregister_dev())? It seems
> that we are using heavy-handed approach in calling
> sas_unregister_devs_sas_addr(), which does the clearing and much more.
I just tried it and it works. If we only clear the ex_phy's
attached_sas_addr, there is no need to call sas_destruct_ports(). The
current patch uses sas_unregister_devs_sas_addr(), which adds the port
to sas_port_del_list, so sas_destruct_ports() has to be called separately
to delete the port.
Should we also delete the port after a device fails to probe?
I can send a new version that only clears the ex_phy's
attached_sas_addr, as you suggest.
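To illustrate why clearing the address is enough to avoid the flutter
verdict, here is a rough userspace sketch. It is not libsas code: struct
mock_ex_phy and both mock_* functions are hypothetical stand-ins, and the
flutter check is reduced to an address comparison only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SAS_ADDR_SIZE 8

/* Hypothetical, simplified stand-in for the expander phy state. */
struct mock_ex_phy {
	uint8_t attached_sas_addr[SAS_ADDR_SIZE];
};

/*
 * Simplified flutter check (assumption: address comparison only).
 * If the address reported on rediscovery equals the one still recorded
 * on the phy, the change is treated as flutter and no new device is
 * discovered.
 */
bool mock_is_broadcast_flutter(const struct mock_ex_phy *phy,
			       const uint8_t *new_addr)
{
	assert(phy && new_addr);
	return memcmp(phy->attached_sas_addr, new_addr, SAS_ADDR_SIZE) == 0;
}

/* What the probe-failure path would do: forget the stale address. */
void mock_clear_attached_addr(struct mock_ex_phy *phy)
{
	assert(phy);
	memset(phy->attached_sas_addr, 0, SAS_ADDR_SIZE);
}
```

With the stale address left in place, a reconnect of 500e004aaaaaaa05
matches and is reported as flutter. Once the address is cleared after
the probe failure, the same reconnect no longer matches and can be
handled as a newly attached device.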
>
>>
>> As devices may probe failed after done REVALIDATING DOMAIN when call
>> sas_probe_devices(). Then after its port is added to the
>> sas_port_del_list,
>> the port will not be deleted until the end of the next REVALIDATING
>> DOMAIN
>> and sas_destruct_ports() is called. A warning about creating a duplicate
>> port will occur in the new REVALIDATING DOMAIN when the end device
>> reconnects. Therefore, the previous destroy_list and sas_port_del_list
>> should be handled after devices probe failed.
>>
>> Signed-off-by: Xingui Yang <yangxingui@...wei.com>
>> ---
>> Changes since v2:
>> - Add a helper for calling sas_destruct_devices() and
>> sas_destruct_ports(),
>> and put the new call at the end of sas_probe_devices() based on John's
>> suggestion.
>>
>> Changes since v1:
>> - Simplify the process of getting ex_phy id based on Jason's suggestion.
>> - Update commit information.
>> ---
>> drivers/scsi/libsas/sas_discover.c | 32 +++++++++++++++++++-----------
>> drivers/scsi/libsas/sas_expander.c | 8 ++++++++
>> drivers/scsi/libsas/sas_internal.h | 6 +++++-
>> 3 files changed, 33 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/scsi/libsas/sas_discover.c
>> b/drivers/scsi/libsas/sas_discover.c
>> index 8fb7c41c0962..8c517e47d2b9 100644
>> --- a/drivers/scsi/libsas/sas_discover.c
>> +++ b/drivers/scsi/libsas/sas_discover.c
>> @@ -17,6 +17,22 @@
>> #include <scsi/sas_ata.h>
>> #include "scsi_sas_internal.h"
>> +static void sas_destruct_ports(struct asd_sas_port *port)
>> +{
>> + struct sas_port *sas_port, *p;
>> +
>> + list_for_each_entry_safe(sas_port, p, &port->sas_port_del_list,
>> del_list) {
>> + list_del_init(&sas_port->del_list);
>> + sas_port_delete(sas_port);
>> + }
>> +}
>> +
>> +static void sas_destruct_devices_and_ports(struct asd_sas_port *port)
>
> "and" in a function name never sounds right.
>
> Can you just call sas_destruct_port(), as it takes a port arg? Maybe
> rename sas_destruct_ports() to sas_delete_ports(), as it does "delete" -
> this may avoid some confusion in names.
As described above, if we only clear ex_phy's attached_sas_addr, we do
not need to call sas_destruct_ports().
Thanks,
Xingui