Message-ID: <20250925195640.32594-1-philipp.reisner@linbit.com>
Date: Thu, 25 Sep 2025 21:56:40 +0200
From: Philipp Reisner <philipp.reisner@...bit.com>
To: Zhu Yanjun <yanjun.zhu@...ux.dev>
Cc: Jason Gunthorpe <jgg@...pe.ca>,
Leon Romanovsky <leon@...nel.org>,
linux-rdma@...r.kernel.org,
linux-kernel@...r.kernel.org,
Philipp Reisner <philipp.reisner@...bit.com>
Subject: [PATCH V3] rdma_rxe: call comp_handler without holding cq->cq_lock
Allow a comp_handler callback implementation to call ib_poll_cq().
With the rdma_rxe driver, a call to ib_poll_cq() ends up in
rxe_poll_cq(), which takes cq->cq_lock. Since rxe_cq_post() invoked the
comp_handler while holding cq->cq_lock, such a callback deadlocks on
the spinlock. The Mellanox and Intel drivers allow a comp_handler
implementation to call ib_poll_cq().

Avoid the deadlock by calling the comp_handler callback without
holding cq->cq_lock.

Other InfiniBand drivers call the comp_handler callback from a single
thread; in the RXE driver, acquiring cq->cq_lock achieved that up to
now. As that lock is no longer held around the callback, introduce a
new lock dedicated to keeping execution of the comp_handler
single-threaded.
Changelog:
v2 -> v3:
- make execution of comp_handler single-threaded
v2: https://lore.kernel.org/lkml/20250822081941.989520-1-philipp.reisner@linbit.com/
v1 -> v2:
- Only reset cq->notify to 0 when invoking the comp_handler
v1: https://lore.kernel.org/all/20250806123921.633410-1-philipp.reisner@linbit.com/
Signed-off-by: Philipp Reisner <philipp.reisner@...bit.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@...ux.dev>
---
drivers/infiniband/sw/rxe/rxe_cq.c | 10 +++++++++-
drivers/infiniband/sw/rxe/rxe_verbs.h | 1 +
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index fffd144d509e..8d94cef7bd50 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -62,6 +62,7 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
 	cq->is_user = uresp;
 
 	spin_lock_init(&cq->cq_lock);
+	spin_lock_init(&cq->comp_handler_lock);
 	cq->ibcq.cqe = cqe;
 	return 0;
 }
@@ -88,6 +89,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	int full;
 	void *addr;
 	unsigned long flags;
+	bool invoke_handler = false;
 
 	spin_lock_irqsave(&cq->cq_lock, flags);
 
@@ -113,11 +115,17 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 
 	if ((cq->notify & IB_CQ_NEXT_COMP) ||
 	    (cq->notify & IB_CQ_SOLICITED && solicited)) {
 		cq->notify = 0;
-		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+		invoke_handler = true;
 	}
 
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
+	if (invoke_handler) {
+		spin_lock_irqsave(&cq->comp_handler_lock, flags);
+		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+		spin_unlock_irqrestore(&cq->comp_handler_lock, flags);
+	}
+
 	return 0;
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index fd48075810dd..04ec60a786f8 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -62,6 +62,7 @@ struct rxe_cq {
 	struct rxe_pool_elem	elem;
 	struct rxe_queue	*queue;
 	spinlock_t		cq_lock;
+	spinlock_t		comp_handler_lock;
 	u8			notify;
 	bool			is_user;
 	atomic_t		num_wq;
--
2.50.1