Message-ID: <20250929112251.72ab759d.pasic@linux.ibm.com>
Date: Mon, 29 Sep 2025 11:22:51 +0200
From: Halil Pasic <pasic@...ux.ibm.com>
To: Dust Li <dust.li@...ux.alibaba.com>
Cc: "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>, Jonathan Corbet <corbet@....net>,
	"D. Wythe" <alibuda@...ux.alibaba.com>, Sidraya Jayagond <sidraya@...ux.ibm.com>,
	Wenjia Zhang <wenjia@...ux.ibm.com>, Mahanta Jambigi <mjambigi@...ux.ibm.com>,
	Tony Lu <tonylu@...ux.alibaba.com>, Wen Gu <guwen@...ux.alibaba.com>,
	Guangguan Wang <guangguan.wang@...ux.alibaba.com>,
	netdev@...r.kernel.org, linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
	linux-s390@...r.kernel.org, Halil Pasic <pasic@...ux.ibm.com>
Subject: Re: [PATCH net-next v5 2/2] net/smc: handle -ENOMEM from
smc_wr_alloc_link_mem gracefully
On Mon, 29 Sep 2025 09:50:52 +0800
Dust Li <dust.li@...ux.alibaba.com> wrote:
> >@@ -175,6 +175,8 @@ struct smc_link {
> > struct completion llc_testlink_resp; /* wait for rx of testlink */
> > int llc_testlink_time; /* testlink interval */
> > atomic_t conn_cnt; /* connections on this link */
> >+ u16 max_send_wr;
> >+ u16 max_recv_wr;
>
> Here, you've moved max_send_wr/max_recv_wr from the link group to individual links.
> This means we can now have different max_send_wr/max_recv_wr values on two
> different links within the same link group.
Only if allocations fail. Please notice that the hunk:
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -810,6 +810,8 @@ int smcr_link_init(struct smc_link_group *lgr, struct smc_link *lnk,
lnk->clearing = 0;
lnk->path_mtu = lnk->smcibdev->pattr[lnk->ibport - 1].active_mtu;
lnk->link_id = smcr_next_link_id(lgr);
+ lnk->max_send_wr = lgr->max_send_wr;
+ lnk->max_recv_wr = lgr->max_recv_wr;
initializes the link values with the values from the lgr, which are in
turn picked up from the sysctls at lgr creation time. I have made an
effort to keep these values the same for each link, but if the
allocation fails and we back off, we can end up with different values
on the links.
The alternative would be to throw in the towel, and not create
a second link if we can't match what worked for the first one.
>
> Since at Alibaba we don't use multi-link configurations, we haven't tested
> this scenario. Have you tested the link-down handling process in a multi-link
> setup?
>
Mahanta was so kind to do most of the testing on this. I don't think
I've tested this myself. @Mahanta: Would you be so kind as to give this
a try if it wasn't covered in the past? The best way is probably to
modify the code to force such a scenario. I don't think it is easy to
trigger in the wild.
BTW I don't expect any problems. I think at worst one link would
end up giving worse performance than the other, but I guess that can
happen for other reasons as well (like different HW for the two links).
But I think getting some sort of query interface which would tell
us how much we actually ended up with would be a good idea anyway.
And I hope we can switch to vmalloc down the road as well, which would
make backing off less likely.
> Otherwise, the patch looks good to me.
>
Thank you very much!
Regards,
Halil