Message-ID:
<CH8PR12MB97419E98111F553FCC117E36BDE8A@CH8PR12MB9741.namprd12.prod.outlook.com>
Date: Wed, 15 Oct 2025 18:34:33 +0000
From: Sean Hefty <shefty@...dia.com>
To: Jason Gunthorpe <jgg@...pe.ca>, Haakon Bugge <haakon.bugge@...cle.com>
CC: Jacob Moroni <jmoroni@...gle.com>, Leon Romanovsky <leon@...nel.org>, Vlad
Dumitrescu <vdumitrescu@...dia.com>, Or Har-Toov <ohartoov@...dia.com>,
Manjunath Patil <manjunath.b.patil@...cle.com>, OFED mailing list
<linux-rdma@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>
Subject: RE: [PATCH for-next] RDMA/cm: Rate limit destroy CM ID timeout error message
> > With this hack, running cmtime with 10.000 connections in loopback,
> > the "cm_destroy_id_wait_timeout: cm_id=000000007ce44ace timed out.
> > state 6 -> 0, refcnt=1" messages are indeed produced. Had to kill
> > cmtime because it was hanging, and then it got defunct with the
> > following stack:
>
> Seems like a bug, it should not hang forever if a MAD is lost..
The hack skipped the call to ib_post_send, so a completion is never written to the CQ. The state machine or the reference counting is likely waiting for that completion, since the completion is how it knows that HW is done trying to access the buffer.
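
To illustrate the point, here is a toy userspace model in C. It is only a sketch, not the CM code: every name in it (toy_msg, toy_send, toy_poll_cq, toy_destroy_wait) is made up, and it assumes the send path takes a reference that only the completion handler drops, which is my guess at the real flow rather than something I have verified.

```c
#include <stdbool.h>

/* Toy model only; all names are hypothetical and none of this is kernel code. */
struct toy_msg {
    int  refcnt;   /* references held on the owning object */
    bool posted;   /* true once the send was handed to the "hardware" */
};

/* The sender takes a reference that only the completion handler drops. */
static void toy_send(struct toy_msg *m, bool skip_post)
{
    m->refcnt++;
    if (!skip_post)
        m->posted = true;   /* stands in for a successful ib_post_send() */
}

/* Poll the "CQ": a completion exists only for work that was actually posted. */
static bool toy_poll_cq(struct toy_msg *m)
{
    if (!m->posted)
        return false;       /* nothing was posted, so nothing ever completes */
    m->posted = false;
    m->refcnt--;            /* completion handler releases the reference */
    return true;
}

/*
 * Destroy waits for the reference count to reach zero. The poll count is
 * bounded here so the model terminates; an unbounded wait would hang.
 */
static bool toy_destroy_wait(struct toy_msg *m, int max_polls)
{
    for (int i = 0; i < max_polls && m->refcnt > 0; i++)
        toy_poll_cq(m);
    return m->refcnt == 0;
}
```

With skip_post false, toy_destroy_wait() returns true after the first poll. With skip_post true, it gives up with refcnt still at 1, because no completion ever arrives to drop the reference.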
- Sean