Message-ID: <20250925170524.7adc1aa3.pasic@linux.ibm.com>
Date: Thu, 25 Sep 2025 17:05:24 +0200
From: Halil Pasic <pasic@...ux.ibm.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Simon Horman <horms@...nel.org>,
	"D. Wythe" <alibuda@...ux.alibaba.com>,
	Dust Li <dust.li@...ux.alibaba.com>,
	Sidraya Jayagond <sidraya@...ux.ibm.com>,
	Wenjia Zhang <wenjia@...ux.ibm.com>,
	Mahanta Jambigi <mjambigi@...ux.ibm.com>,
	Tony Lu <tonylu@...ux.alibaba.com>, Wen Gu <guwen@...ux.alibaba.com>,
	netdev@...r.kernel.org, linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
	linux-s390@...r.kernel.org, Halil Pasic <pasic@...ux.ibm.com>
Subject: Re: [PATCH net-next v3 2/2] net/smc: handle -ENOMEM from
	smc_wr_alloc_link_mem gracefully
On Thu, 25 Sep 2025 11:40:40 +0200
Paolo Abeni <pabeni@...hat.com> wrote:
> > +	do {
> > +		rc = smc_ib_create_queue_pair(lnk);
> > +		if (rc)
> > +			goto dealloc_pd;
> > +		rc = smc_wr_alloc_link_mem(lnk);
> > +		if (!rc)
> > +			break;
> > +		else if (rc != -ENOMEM) /* give up */
> > +			goto destroy_qp;
> > +		/* retry with smaller ... */
> > +		lnk->max_send_wr /= 2;
> > +		lnk->max_recv_wr /= 2;
> > +		/* ... unless droping below old SMC_WR_BUF_SIZE */
> > +		if (lnk->max_send_wr < 16 || lnk->max_recv_wr < 48)
> > +			goto destroy_qp;
>
> If i.e. smc.sysctl_smcr_max_recv_wr == 2048, and
> smc.sysctl_smcr_max_send_wr == 16, the above loop can give-up a little
> too early - after the first failure. What about changing the termination
> condition to:
>
> lnk->max_send_wr < 16 && lnk->max_recv_wr < 48
>
> and use 2 as a lower bound for both lnk->max_send_wr and lnk->max_recv_wr?

My intention was to preserve the ratio (max_recv_wr/max_send_wr), because
I assume that the optimal ratio is workload dependent and that scaling
both down at the same rate is easy to understand. I also wanted to never
dip below the old values, to avoid regressions caused by having even
fewer WR buffers than before the change.

I get your point, but as long as the ratio is kept, I think the problem,
if it is considered a problem, is there to stay. For example, for
smc.sysctl_smcr_max_recv_wr == 2048 and smc.sysctl_smcr_max_send_wr == 2
we would still give up after the first failure, even with 2 as a lower
bound.
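
To make the arithmetic concrete, here is a rough userspace sketch (not
kernel code) of the ratio-preserving back-off. It assumes every allocation
attempt fails with -ENOMEM and counts how many reduced sizes would be
retried before giving up. The struct wr_limits type, the backoff_retries()
helper, and the send_floor/recv_floor parameters are made up for
illustration; only the halving and the give-up check mirror the quoted
hunk.

```c
#include <assert.h>

/* Hypothetical stand-in for the link; only the two fields used in the hunk. */
struct wr_limits {
	unsigned int max_send_wr;
	unsigned int max_recv_wr;
};

/*
 * Assume the initial allocation and every retry fail with -ENOMEM.
 * Returns the number of retries with reduced sizes that would be
 * attempted before the loop gives up.
 */
static unsigned int backoff_retries(struct wr_limits lim,
				    unsigned int send_floor,
				    unsigned int recv_floor)
{
	unsigned int retries = 0;

	/* non-zero floors guarantee termination */
	assert(send_floor > 0 && recv_floor > 0);
	for (;;) {
		/* halve both, which preserves the ratio up to rounding */
		lim.max_send_wr /= 2;
		lim.max_recv_wr /= 2;
		/* give up as soon as either value drops below its floor */
		if (lim.max_send_wr < send_floor || lim.max_recv_wr < recv_floor)
			return retries;
		retries++;	/* one more attempt with the smaller sizes */
	}
}
```

With the floors from the patch (16 and 48), a start of send == 16 and
recv == 2048 gives zero retries, which is the case you describe. With
both floors set to 2, send == 2 and recv == 2048 also gives zero retries,
which is the case I mean above.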

Let me also state that, in my opinion, giving up isn't that bad, because
SMC-R is supposed to be an optimization and we still have the TCP
fallback. If we end up much worse than TCP because the back-off went
overboard, that is probably worse than just giving up on SMC-R and
going with TCP.

On the other hand, allowing the ratio to change would make things more
complicated and less predictable, and could also take more iterations.
Consider for example smc.sysctl_smcr_max_recv_wr == 2048 and
smc.sysctl_smcr_max_send_wr == 2000.

So I would prefer sticking to the current logic.

Regards,
Halil