Message-ID: <acad498b-06e6-4639-b389-ef954e4c6abc@redhat.com>
Date: Wed, 1 Oct 2025 09:21:09 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Halil Pasic <pasic@...ux.ibm.com>, Dust Li <dust.li@...ux.alibaba.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Simon Horman <horms@...nel.org>, Jonathan Corbet <corbet@....net>,
"D. Wythe" <alibuda@...ux.alibaba.com>,
Sidraya Jayagond <sidraya@...ux.ibm.com>, Wenjia Zhang
<wenjia@...ux.ibm.com>, Mahanta Jambigi <mjambigi@...ux.ibm.com>,
Tony Lu <tonylu@...ux.alibaba.com>, Wen Gu <guwen@...ux.alibaba.com>,
Guangguan Wang <guangguan.wang@...ux.alibaba.com>, netdev@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-rdma@...r.kernel.org, linux-s390@...r.kernel.org
Subject: Re: [PATCH net-next v5 2/2] net/smc: handle -ENOMEM from
smc_wr_alloc_link_mem gracefully
On 9/29/25 11:22 AM, Halil Pasic wrote:
> On Mon, 29 Sep 2025 09:50:52 +0800
> Dust Li <dust.li@...ux.alibaba.com> wrote:
>
>>> @@ -175,6 +175,8 @@ struct smc_link {
>>> struct completion llc_testlink_resp; /* wait for rx of testlink */
>>> int llc_testlink_time; /* testlink interval */
>>> atomic_t conn_cnt; /* connections on this link */
>>> + u16 max_send_wr;
>>> + u16 max_recv_wr;
>>
>> Here, you've moved max_send_wr/max_recv_wr from the link group to individual links.
>> This means we can now have different max_send_wr/max_recv_wr values on two
>> different links within the same link group.
>
> Only if allocations fail. Please notice that the hunk:
>
> --- a/net/smc/smc_core.c
> +++ b/net/smc/smc_core.c
> @@ -810,6 +810,8 @@ int smcr_link_init(struct smc_link_group *lgr, struct smc_link *lnk,
> lnk->clearing = 0;
> lnk->path_mtu = lnk->smcibdev->pattr[lnk->ibport - 1].active_mtu;
> lnk->link_id = smcr_next_link_id(lgr);
> + lnk->max_send_wr = lgr->max_send_wr;
> + lnk->max_recv_wr = lgr->max_recv_wr;
>
> initializes the link values with the values from the lgr, which are in
> turn picked up from the sysctls at lgr creation time. I have made an
> effort to keep these values the same for each link, but in case the
> allocation fails and we do back off, we can end up with different values
> on the links.
>
> The alternative would be to throw in the towel, and not create
> a second link if we can't match what worked for the first one.
>
>>
>> Since at Alibaba we don't use multi-link configurations, we haven't tested
>> this scenario. Have you tested the link-down handling process in a multi-link
>> setup?
>>
>
> Mahanta was so kind as to do most of the testing on this. I don't think
> I've tested this myself. @Mahanta: Would you be so kind as to give this
> a try if it wasn't covered in the past? The best way is probably to
> modify the code to force such a scenario. I don't think it is easy to
> trigger in the wild.
>
> BTW I don't expect any problems. I think at worst one link would
> end up giving worse performance than the other, but I guess that can
> happen for other reasons as well (like different HW for the two links).
>
> But I think getting some sort of query interface down the road, which
> would tell us how much we ended up with, would be a good idea anyway.
>
> And I hope we can switch to vmalloc down the road as well, which would
> make the back-off less likely.
Unfortunately we are closing the net-next PR right now and I would
prefer that such testing be reported explicitly. Let's defer this series
to the next cycle: please re-post when net-next reopens after Oct 12th.
Thanks,
Paolo