Message-ID: <e88ac12b80b510648f7ab1d4cee50c43908ba49d.camel@mellanox.com>
Date: Wed, 6 Feb 2019 16:55:24 +0000
From: Saeed Mahameed <saeedm@...lanox.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Tariq Toukan <tariqt@...lanox.com>,
"xiyou.wangcong@...il.com" <xiyou.wangcong@...il.com>
Subject: Re: [Patch net-next] mlx5: use RCU lock in mlx5_eq_cq_get()
On Wed, 2019-02-06 at 12:02 +0000, Tariq Toukan wrote:
>
> On 2/6/2019 2:35 AM, Cong Wang wrote:
> > mlx5_eq_cq_get() is called in the IRQ handler, and the spinlock
> > inside it sees heavy contention when we test a heavy workload
> > with 60 RX queues and 80 CPUs, as is clearly shown in the
> > flame graph.
> >
Hi Cong,
The patch looks OK to me, but I really doubt you can hit contention on
the latest upstream driver. We already have a spinlock per EQ, which
means a spinlock per core: each EQ (core) MSI-X handler can only access
one spinlock (its own). So I am surprised that you saw contention.
Maybe you are not running the latest upstream driver?
What is the workload?
> > In fact, radix_tree_lookup() is perfectly fine with RCU read lock,
> > we don't have to take a spinlock on this hot path. It is pretty
> > much similar to commit 291c566a2891
> > ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow paths
> > are still serialized with the spinlock, and with synchronize_irq()
> > it should be safe to just move the fast path to RCU read lock.
> >
> > This patch itself reduces the latency by about 50% with our
> > workload.
> >
> > Cc: Saeed Mahameed <saeedm@...lanox.com>
> > Cc: Tariq Toukan <tariqt@...lanox.com>
> > Signed-off-by: Cong Wang <xiyou.wangcong@...il.com>
> > ---
> > drivers/net/ethernet/mellanox/mlx5/core/eq.c | 12 ++++++------
> > 1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > index ee04aab65a9f..7092457705a2 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > @@ -114,11 +114,11 @@ static struct mlx5_core_cq
> > *mlx5_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
> > struct mlx5_cq_table *table = &eq->cq_table;
> > struct mlx5_core_cq *cq = NULL;
> >
> > - spin_lock(&table->lock);
> > + rcu_read_lock();
> > cq = radix_tree_lookup(&table->tree, cqn);
> > if (likely(cq))
> > mlx5_cq_hold(cq);
> > - spin_unlock(&table->lock);
> > + rcu_read_unlock();
>
> Thanks for your patch.
>
> I think we can improve it further, by taking the if statement out of
> the critical section.
>
No, mlx5_cq_hold() must stay under the RCU read lock; otherwise the cq
might get freed before the IRQ handler gets a chance to increment its
refcount. Another way to do it is to skip refcounting in the IRQ handler
entirely and fence cq removal via synchronize_irq(eq->irqn) in
mlx5_eq_del_cq(). But let's keep one approach (refcounting):
synchronize_irq()/RCU can be heavy at times, especially on RDMA
workloads that create and destroy cqs in loops.
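
To illustrate the ordering constraint, here is a minimal userspace
sketch. It is not driver code: every mock_* name is hypothetical, the
RCU helpers are no-op stand-ins, and a plain array replaces the radix
tree. It only shows that the lookup and the refcount increment sit
inside the same read-side section.

```c
/* Userspace mock, NOT kernel code. All mock_* names are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MOCK_TABLE_SIZE 8

struct mock_cq {
	unsigned int cqn;
	atomic_int refcount;
};

/* Stand-in for the per-EQ radix tree (assumption: a flat array). */
static struct mock_cq *mock_table[MOCK_TABLE_SIZE];

/* No-op stand-ins; real RCU delays freeing until readers are done. */
static void mock_rcu_read_lock(void) {}
static void mock_rcu_read_unlock(void) {}

static void mock_cq_hold(struct mock_cq *cq)
{
	atomic_fetch_add(&cq->refcount, 1);
}

/* Drops one reference and returns the remaining count. */
static int mock_cq_put(struct mock_cq *cq)
{
	return atomic_fetch_sub(&cq->refcount, 1) - 1;
}

static struct mock_cq *mock_eq_cq_get(unsigned int cqn)
{
	struct mock_cq *cq = NULL;

	mock_rcu_read_lock();
	if (cqn < MOCK_TABLE_SIZE)
		cq = mock_table[cqn];
	/*
	 * Take the reference before leaving the read-side section.
	 * Once the section ends, nothing keeps the object alive
	 * except the reference taken here.
	 */
	if (cq)
		mock_cq_hold(cq);
	mock_rcu_read_unlock();

	return cq;
}

/* Registers one cq with an initial reference and looks it up. */
static int mock_demo(void)
{
	static struct mock_cq cq = { .cqn = 3 };
	struct mock_cq *found;

	atomic_init(&cq.refcount, 1);
	mock_table[cq.cqn] = &cq;

	found = mock_eq_cq_get(3);
	assert(found == &cq);

	return atomic_load(&cq.refcount);
}
```

If the hold were moved after mock_rcu_read_unlock(), a real remover
could free the object in between, and the increment would touch freed
memory.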
> Other than that, patch LGTM.
>
> Regards,
> Tariq
>
> >
> > return cq;
> > }
> > @@ -371,9 +371,9 @@ int mlx5_eq_add_cq(struct mlx5_eq *eq, struct
> > mlx5_core_cq *cq)
> > struct mlx5_cq_table *table = &eq->cq_table;
> > int err;
> >
> > - spin_lock_irq(&table->lock);
> > + spin_lock(&table->lock);
> > err = radix_tree_insert(&table->tree, cq->cqn, cq);
> > - spin_unlock_irq(&table->lock);
> > + spin_unlock(&table->lock);
> >
> > return err;
> > }
> > @@ -383,9 +383,9 @@ int mlx5_eq_del_cq(struct mlx5_eq *eq, struct
> > mlx5_core_cq *cq)
> > struct mlx5_cq_table *table = &eq->cq_table;
> > struct mlx5_core_cq *tmp;
> >
> > - spin_lock_irq(&table->lock);
> > + spin_lock(&table->lock);
> > tmp = radix_tree_delete(&table->tree, cq->cqn);
> > - spin_unlock_irq(&table->lock);
> > + spin_unlock(&table->lock);
> >
> > if (!tmp) {
> > mlx5_core_warn(eq->dev, "cq 0x%x not found in eq 0x%x
> > tree\n", eq->eqn, cq->cqn);
> >