Message-ID: <4c5347ff-779b-48d7-8234-2aac9992f487@linux.ibm.com>
Date: Fri, 5 Sep 2025 19:42:19 +0530
From: Mahanta Jambigi <mjambigi@...ux.ibm.com>
To: Halil Pasic <pasic@...ux.ibm.com>, Dust Li <dust.li@...ux.alibaba.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
"D. Wythe" <alibuda@...ux.alibaba.com>,
Sidraya Jayagond <sidraya@...ux.ibm.com>,
Wenjia Zhang <wenjia@...ux.ibm.com>,
Tony Lu <tonylu@...ux.alibaba.com>, Wen Gu <guwen@...ux.alibaba.com>,
netdev@...r.kernel.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
linux-s390@...r.kernel.org
Subject: Re: [PATCH net-next 1/2] net/smc: make wr buffer count configurable

On 05/09/25 5:31 pm, Halil Pasic wrote:
> On Fri, 5 Sep 2025 11:00:59 +0200
> Halil Pasic <pasic@...ux.ibm.com> wrote:
>
>>> 1. What if the two sides have different max_send_wr/max_recv_wr configurations?
>>> IIUC, for example, if the client sets max_send_wr to 64 but the server sets
>>> max_recv_wr to 16, the client might overflow the server's QP receive
>>> queue, potentially causing an RNR (Receiver Not Ready) error.
>>
>> I don't think the 16 is spec'ed anywhere, and if the client and the server
>> need to agree on the same value, it should either be spec'ed or a
>> protocol mechanism for negotiating it needs to exist. So what is your
>> take on this as an SMC maintainer?
>>
>> I think we have tested heterogeneous setups and didn't see any grave
>> issues, but please let me do a follow-up on this. Maybe the other
>> maintainers can chime in as well.
>
> Did some research and some thinking. Are you concerned about a
> performance regression for e.g. 64 -> 16 compared to 16 -> 16? According
> to my current understanding, an RNR must not lead to a catastrophic
> failure; the RDMA/IB stack is supposed to handle it.
>
Hi Dust,
I configured a client-server setup and did some SMC-R testing with the
values you proposed: ran iperf3 (via smc_run) with 128 parallel
connections, and it looks good. No TCP fallback, no obvious errors. As
Halil mentioned, I don't see any catastrophic failure here. Let me know
if I should stress the system with more tests, or if there is a
specific test you think could provoke RNR errors. The setup is ready
and I can try it.
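
For reference, this is roughly how the two knobs were set before the
run (hostnames are placeholders; the sysctl names are the ones this
series introduces, with the values from your example):

  # client: send queue sized for 64 work requests
  sysctl -w net.smc.smcr_max_send_wr=64

  # server: receive queue deliberately small, 16 work requests
  sysctl -w net.smc.smcr_max_recv_wr=16
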
*Client* side logs:
[root@...ent ~]$ sysctl net.smc.smcr_max_send_wr
net.smc.smcr_max_send_wr = 64
[root@...ent ~]$
[root@...ent ~]$ smc_run iperf3 -P 128 -t 120 -c 10.25.0.72
Connecting to host 10.25.0.72, port 5201
[ 5] local 10.25.0.73 port 52544 connected to 10.25.0.72 port 5201
[ 7] local 10.25.0.73 port 52558 connected to 10.25.0.72 port 5201
*Server* side logs:
[root@...ver ~]$ sysctl net.smc.smcr_max_recv_wr
net.smc.smcr_max_recv_wr = 16
[root@...ver ~]$
[root@...ver ~]$ smc_run iperf3 -s
-----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
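
Regarding Halil's point above that an RNR NAK must not be catastrophic:
my understanding is that this is governed by the requester QP's
rnr_retry attribute, where the value 7 means "retry indefinitely", so a
slow receiver stalls the sender rather than killing the connection. A
minimal userspace libibverbs sketch of where that attribute lives
(illustrative values only; the in-kernel SMC code would use the
equivalent ib_qp_attr fields, which I have not re-checked here):

#include <infiniband/verbs.h>

/* Sketch: the RNR policy is programmed when moving an RC QP to RTS. */
static int set_rnr_policy(struct ibv_qp *qp)
{
	struct ibv_qp_attr attr = {
		.qp_state      = IBV_QPS_RTS,
		.timeout       = 14,	/* illustrative ACK timeout */
		.retry_cnt     = 7,	/* transport retries */
		.rnr_retry     = 7,	/* 7 == retry forever on RNR NAK */
		.sq_psn        = 0,
		.max_rd_atomic = 1,
	};

	return ibv_modify_qp(qp, &attr,
			     IBV_QP_STATE | IBV_QP_TIMEOUT |
			     IBV_QP_RETRY_CNT | IBV_QP_RNR_RETRY |
			     IBV_QP_SQ_PSN | IBV_QP_MAX_QP_RD_ATOMIC);
}

/* The responder side sets min_rnr_timer (IBV_QP_MIN_RNR_TIMER) during
 * the RTR transition; it bounds how long the requester backs off
 * before retrying. */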