Message-ID: <20250905140135.2487a99f.pasic@linux.ibm.com>
Date: Fri, 5 Sep 2025 14:01:35 +0200
From: Halil Pasic <pasic@...ux.ibm.com>
To: Dust Li <dust.li@...ux.alibaba.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
    Simon Horman <horms@...nel.org>,
    "D. Wythe" <alibuda@...ux.alibaba.com>,
    Sidraya Jayagond <sidraya@...ux.ibm.com>,
    Wenjia Zhang <wenjia@...ux.ibm.com>,
    Mahanta Jambigi <mjambigi@...ux.ibm.com>,
    Tony Lu <tonylu@...ux.alibaba.com>, Wen Gu <guwen@...ux.alibaba.com>,
    netdev@...r.kernel.org, linux-doc@...r.kernel.org,
    linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
    linux-s390@...r.kernel.org, Halil Pasic <pasic@...ux.ibm.com>
Subject: Re: [PATCH net-next 1/2] net/smc: make wr buffer count configurable

On Fri, 5 Sep 2025 11:00:59 +0200
Halil Pasic <pasic@...ux.ibm.com> wrote:
> > 1. What if the two sides have different max_send_wr/max_recv_wr configurations?
> > IIUC, For example, if the client sets max_send_wr to 64, but the server sets
> > max_recv_wr to 16, the client might overflow the server's QP receive
> > queue, potentially causing an RNR (Receiver Not Ready) error.
>
> I don't think the 16 is spec-ed anywhere and if the client and the server
> need to agree on the same value it should either be speced, or a
> protocol mechanism for negotiating it needs to exist. So what is your
> take on this as an SMC maintainer?
>
> I think, we have tested heterogeneous setups and didn't see any grave
> issues. But let me please do a follow up on this. Maybe the other
> maintainers can chime in as well.

I did some research and some thinking. Are you concerned about a
performance regression for e.g. 64 -> 16 compared to 16 -> 16? According
to my current understanding, an RNR must not lead to a catastrophic
failure; the RDMA/IB stack is supposed to handle it.

I would also like to point out that bumping SMC_WR_BUF_CNT basically has
the same problem, although admittedly to a smaller extent, because there
the mismatch is only between "old" and "new".
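
To make the 64 -> 16 versus 16 -> 16 comparison concrete, here is a toy
model only. It is not SMC or verbs code, and it ignores retries and any
flow control that the protocol or the HCA applies. It simply counts how
many messages in a back-to-back burst would arrive while no receive
buffer is posted. The function name `rnr_events` and the `drain_every`
parameter are made up for this illustration.

```python
def rnr_events(send_depth: int, recv_depth: int, drain_every: int = 0) -> int:
    """Count arrivals that find no receive buffer posted (toy model).

    send_depth:  messages the sender posts back to back.
    recv_depth:  receive buffers posted before the burst starts.
    drain_every: the receiver reposts one buffer after every
                 `drain_every` arrivals; 0 means it reposts nothing
                 during the burst (assumed worst case).
    """
    posted = recv_depth
    rnr = 0
    for arrival in range(1, send_depth + 1):
        if posted > 0:
            posted -= 1   # the message consumes a posted receive buffer
        else:
            rnr += 1      # nothing posted: this arrival would see an RNR
        if drain_every and arrival % drain_every == 0:
            posted += 1   # receiver handled a message and reposted a buffer
    return rnr


if __name__ == "__main__":
    print("64 -> 16:", rnr_events(64, 16))
    print("16 -> 16:", rnr_events(16, 16))
```

In this model the mismatch only matters when the receiver falls behind
during a burst. With `drain_every=1`, that is a receiver that reposts as
fast as messages arrive, even the 64 -> 16 case sees no empty receive
queue.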

Assuming that my understanding is correct, I believe that the problem of
the potential RNR is inherent to the objective of the series, and
probably one that can be lived with. Given this entire EID business, I
think the SMC-R setup is likely to happen in a coordinated fashion for
all potential peers, and I hope whoever tweaks those values has a
sufficient understanding or empirical evidence to justify the tweaks.

Assuming my understanding is not utterly wrong, I would very much like
to know what you would want me to do with this.

Thank you in advance!

Regards,
Halil