Message-ID: <a33cb54f-3414-4034-bb0a-9aeebd65f044@linux.ibm.com>
Date: Wed, 18 Oct 2023 21:43:51 +0200
From: Wenjia Zhang <wenjia@...ux.ibm.com>
To: Guangguan Wang <guangguan.wang@...ux.alibaba.com>,
dust.li@...ux.alibaba.com, kgraul@...ux.ibm.com,
jaka@...ux.ibm.com, davem@...emloft.net, edumazet@...gle.com,
kuba@...nel.org, pabeni@...hat.com
Cc: tonylu@...ux.alibaba.com, alibuda@...ux.alibaba.com,
guwen@...ux.alibaba.com, linux-s390@...r.kernel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net v2 2/2] net/smc: correct the reason code in
smc_listen_find_device when fallback
On 18.10.23 10:35, Guangguan Wang wrote:
>
>
> On 2023/10/18 15:01, Dust Li wrote:
>> On Tue, Oct 17, 2023 at 08:42:34PM +0800, Guangguan Wang wrote:
>>> ini->rc is used to store the last error that occurred while looking for a usable
>>> ISM or RDMA device in smc_listen_find_device, and is set by calling
>>> smc_find_device_store_rc. Once ini->rc is assigned a non-zero value,
>>> the value cannot be overwritten anymore. So ini->rc should be set to
>>> the error reason only when an error actually occurs.
>>>
>>> When finding ISM/RDMA devices, device not found is not a real error, as
>>> not all machines have ISM/RDMA devices. Failures after a device is found, when
What do you mean by this sentence? Do you mean that finding no (ISM/RDMA)
device at all is not a real error? If not, what is the real reason?
>>> initializing the device or when initializing the connection, are real errors, and
>>> should be stored in ini->rc.
>>>
>>> SMC_CLC_DECL_DIFFPREFIX is also not a real error, as SMC-RV2 does
>>> not require the same prefix.
>>>
>>> Signed-off-by: Guangguan Wang <guangguan.wang@...ux.alibaba.com>
>>> ---
>>> net/smc/af_smc.c | 12 +++---------
>>> 1 file changed, 3 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>>> index b3a67a168495..21e9c6ec4d01 100644
>>> --- a/net/smc/af_smc.c
>>> +++ b/net/smc/af_smc.c
>>> @@ -2163,10 +2163,8 @@ static void smc_find_ism_v2_device_serv(struct smc_sock *new_smc,
>>> }
>>> mutex_unlock(&smcd_dev_list.mutex);
>>>
>>> - if (!ini->ism_dev[0]) {
>>> - smc_find_device_store_rc(SMC_CLC_DECL_NOSMCD2DEV, ini);
>>> + if (!ini->ism_dev[0])
>>
>> Hi Guangguan,
>>
>> Generally, I think this is right. Fallback in one kind of device should
>> not be the final fallback reason.
>>
>> But what if we have more than one device and failed more than once, for
>> example:
>> Let's say we have an ISM device and an RDMA device. First we looked for the ISM device
>> and failed during initialization, so we got fallback reason A. Then we
>> looked for the RDMA device and failed again, with another reason B.
>> Finally, we fall back to TCP. What should the fallback reason be?
>
> IMO, the order of finding devices is defined by preference: ISM v2, ISM v1, RDMA v2, RDMA v1; the earlier, the more preferred.
> I think it should return the most preferred device's failure reason if that device was found and failed during initialization.
> So here it should return the first reason (reason A).
>
In the case Dust mentioned, I'd prefer a reason that covers both A and B, like
the current reason SMC_CLC_DECL_NOSMCDEV.
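
To make the two options concrete, here is a minimal userspace sketch, not
kernel code. It assumes only what the commit message states: once ini->rc is
non-zero it is not overwritten. The reason codes, struct init_info, store_rc,
reason_first and reason_combined are all hypothetical names for illustration;
REASON_GENERIC merely stands in for a combined reason such as
SMC_CLC_DECL_NOSMCDEV.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reason codes, for illustration only; not kernel values. */
enum {
	REASON_NONE = 0,
	REASON_A = 1,       /* e.g. ISM device found, initialization failed */
	REASON_B = 2,       /* e.g. RDMA device found, initialization failed */
	REASON_GENERIC = 3, /* stand-in for a combined "no usable device" reason */
};

/* Stand-in for the relevant part of struct smc_init_info. */
struct init_info {
	int rc;
};

/* First error wins: a non-zero rc is never overwritten. */
static void store_rc(int rc, struct init_info *ini)
{
	if (!ini->rc)
		ini->rc = rc;
}

/* Option 1: report the failure of the most preferred device. */
static int reason_first(const int *rcs, size_t n)
{
	struct init_info ini = { .rc = REASON_NONE };
	size_t i;

	for (i = 0; i < n; i++)
		if (rcs[i])
			store_rc(rcs[i], &ini);
	return ini.rc;
}

/* Option 2: report a generic reason once two distinct failures occurred. */
static int reason_combined(const int *rcs, size_t n)
{
	int first = REASON_NONE;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!rcs[i])
			continue;
		if (!first)
			first = rcs[i];
		else if (rcs[i] != first)
			return REASON_GENERIC;
	}
	return first;
}

/* Self-check for the two-failure case from the thread (A, then B). */
static void check_two_failures(void)
{
	const int rcs[] = { REASON_A, REASON_B };

	assert(reason_first(rcs, 2) == REASON_A);
	assert(reason_combined(rcs, 2) == REASON_GENERIC);
}
```

With failures A then B, option 1 yields A and option 2 yields the generic
reason; with a single failure both yield that failure's reason.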
>>
>> OTOH, SMC_CLC_DECL_NOSMCD2DEV is only used here. Removing it would mean
>> that we would never see SMC_CLC_DECL_NOSMCD2DEV in the fallback reason,
>> which makes it meaningless.
>>
>
> Is SMC_CLC_DECL_NOSMCD2DEV really necessary? There is no reason named SMC_CLC_DECL_NOSMCR2DEV.
>
I do see the need for it when debugging.
> Thanks,
> Guangguan Wang
>
>> Best regards,
>> Dust
>>
>>> goto not_found;
>>> - }
>>>
>>> smc_ism_get_system_eid(&eid);
>>> if (!smc_clc_match_eid(ini->negotiated_eid, smc_v2_ext,
>>> @@ -2216,9 +2214,9 @@ static void smc_find_ism_v1_device_serv(struct smc_sock *new_smc,
>>> rc = smc_listen_ism_init(new_smc, ini);
>>> if (!rc)
>>> return; /* V1 ISM device found */
>>> + smc_find_device_store_rc(rc, ini);
>>>
>>> not_found:
>>> - smc_find_device_store_rc(rc, ini);
>>> ini->smcd_version &= ~SMC_V1;
>>> ini->ism_dev[0] = NULL;
>>> ini->is_smcd = false;
>>> @@ -2267,10 +2265,8 @@ static void smc_find_rdma_v2_device_serv(struct smc_sock *new_smc,
>>> ini->smcrv2.saddr = new_smc->clcsock->sk->sk_rcv_saddr;
>>> ini->smcrv2.daddr = smc_ib_gid_to_ipv4(smc_v2_ext->roce);
>>> rc = smc_find_rdma_device(new_smc, ini);
>>> - if (rc) {
>>> - smc_find_device_store_rc(rc, ini);
>>> + if (rc)
>>> goto not_found;
>>> - }
>>> if (!ini->smcrv2.uses_gateway)
>>> memcpy(ini->smcrv2.nexthop_mac, pclc->lcl.mac, ETH_ALEN);
>>>
>>> @@ -2331,8 +2327,6 @@ static int smc_listen_find_device(struct smc_sock *new_smc,
>>>
>>> /* check for matching IP prefix and subnet length (V1) */
>>> prfx_rc = smc_listen_prfx_check(new_smc, pclc);
>>> - if (prfx_rc)
>>> - smc_find_device_store_rc(prfx_rc, ini);
>>>
>>> /* get vlan id from IP device */
>>> if (smc_vlan_by_tcpsk(new_smc->clcsock, ini))
>>> --
>>> 2.24.3 (Apple Git-128)