Message-ID: <CADUfDZqFjjsdWpJ2=NvC5Ny2r7PyLWv4LEREEEk7=RzW-ZosYA@mail.gmail.com>
Date: Tue, 5 Nov 2024 12:39:27 -0800
From: Caleb Sander <csander@...estorage.com>
To: Saeed Mahameed <saeed@...nel.org>
Cc: Saeed Mahameed <saeedm@...dia.com>, Leon Romanovsky <leon@...nel.org>, Tariq Toukan <tariqt@...dia.com>, 
	Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	Parav Pandit <parav@...dia.com>, netdev@...r.kernel.org, linux-rdma@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v3] mlx5/core: Schedule EQ comp tasklet only if necessary

On Tue, Nov 5, 2024 at 10:56 AM Saeed Mahameed <saeed@...nel.org> wrote:
>
> On 31 Oct 10:34, Caleb Sander Mateos wrote:
> >Currently, the mlx5_eq_comp_int() interrupt handler schedules a tasklet
> >to call mlx5_cq_tasklet_cb() if it processes any completions. For CQs
> >whose completions don't need to be processed in tasklet context, this
> >adds unnecessary overhead. In a heavy TCP workload, we see 4% of CPU
> >time spent on the tasklet_trylock() in tasklet_action_common(), with a
> >smaller amount spent on the atomic operations in tasklet_schedule(),
> >tasklet_clear_sched(), and locking the spinlock in mlx5_cq_tasklet_cb().
> >TCP completions are handled by mlx5e_completion_event(), which schedules
> >NAPI to poll the queue, so they don't need tasklet processing.
> >
> >Schedule the tasklet in mlx5_add_cq_to_tasklet() instead to avoid this
> >overhead. mlx5_add_cq_to_tasklet() is responsible for enqueuing the CQs
> >to be processed in tasklet context, so it can schedule the tasklet. CQs
> >that need tasklet processing have their interrupt comp handler set to
> >mlx5_add_cq_to_tasklet(), so they will schedule the tasklet. CQs that
> >don't need tasklet processing won't schedule the tasklet. To avoid
> >scheduling the tasklet multiple times during the same interrupt, only
> >schedule the tasklet in mlx5_add_cq_to_tasklet() if the tasklet work
> >queue was empty before the new CQ was pushed to it.
> >
> >The additional branch in mlx5_add_cq_to_tasklet(), called for each EQE,
> >may add a small cost for the userspace Infiniband CQs whose completions
> >are processed in tasklet context. But this seems worth it to avoid the
> >tasklet overhead for CQs that don't need it.
> >
> >Note that the mlx4 driver works the same way: it schedules the tasklet
> >in mlx4_add_cq_to_tasklet() and only if the work queue was empty before.
> >
> >Signed-off-by: Caleb Sander Mateos <csander@...estorage.com>
> >Reviewed-by: Parav Pandit <parav@...dia.com>
> >---
> >v3: revise commit message
> >v2: reorder variable declarations, describe CPU profile results
> >
> > drivers/net/ethernet/mellanox/mlx5/core/cq.c | 5 +++++
> > drivers/net/ethernet/mellanox/mlx5/core/eq.c | 5 +----
> > 2 files changed, 6 insertions(+), 4 deletions(-)
> >
> >diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cq.c b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
> >index 4caa1b6f40ba..25f3b26db729 100644
> >--- a/drivers/net/ethernet/mellanox/mlx5/core/cq.c
> >+++ b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
> >@@ -69,22 +69,27 @@ void mlx5_cq_tasklet_cb(struct tasklet_struct *t)
> > static void mlx5_add_cq_to_tasklet(struct mlx5_core_cq *cq,
> >                                  struct mlx5_eqe *eqe)
> > {
> >       unsigned long flags;
> >       struct mlx5_eq_tasklet *tasklet_ctx = cq->tasklet_ctx.priv;
> >+      bool schedule_tasklet = false;
> >
> >       spin_lock_irqsave(&tasklet_ctx->lock, flags);
> >       /* When migrating CQs between EQs will be implemented, please note
> >        * that you need to sync this point. It is possible that
> >        * while migrating a CQ, completions on the old EQs could
> >        * still arrive.
> >        */
> >       if (list_empty_careful(&cq->tasklet_ctx.list)) {
> >               mlx5_cq_hold(cq);
>
> The condition here is counterintuitive; please add a comment that relates
> it to the tasklet routine mlx5_cq_tasklet_cb(), something like:
> /* If this list isn't empty, the tasklet is already scheduled and has not
>  * yet started processing the list. The spinlock here guarantees that the
>  * addition of this CQ will be seen by the next execution, so rescheduling
>  * the tasklet is not required.
>  */

Sure, will send out a v4.

>
> Another way to do this is to set tasklet_ctx.sched_flag = true inside
> mlx5_add_cq_to_tasklet(), and then schedule once at the end of EQ IRQ
> processing if (tasklet_ctx.sched_flag == true), to avoid "too" early
> scheduling. But since the tasklet can't run until the IRQ handler returns,
> I think your solution shouldn't suffer from "too" early scheduling.

Right, that was my thinking behind the list_empty(&tasklet_ctx->list) check.
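
For anyone following the thread, here is a minimal userspace sketch of the two approaches discussed above: scheduling when the work queue goes from empty to non-empty, and the flag-based alternative that schedules once at the end of IRQ processing. It is only a model, not the driver code: every model_* name is hypothetical, the singly linked queue stands in for the kernel list, locking is omitted, and a counter stands in for tasklet_schedule().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the mlx5 structures. */
struct model_cq {
	struct model_cq *next;
	bool queued;                 /* models the CQ already being on the list */
};

struct model_tasklet_ctx {
	struct model_cq *head;       /* models the tasklet work queue */
	struct model_cq **tail;
	bool sched_flag;             /* used only by the flag-based variant */
	unsigned int schedule_calls; /* counts simulated tasklet_schedule() calls */
};

static void model_init(struct model_tasklet_ctx *ctx)
{
	ctx->head = NULL;
	ctx->tail = &ctx->head;
	ctx->sched_flag = false;
	ctx->schedule_calls = 0;
}

/* Append cq unless it is already queued.
 * Returns true only if the queue was empty before this push.
 */
static bool model_push(struct model_tasklet_ctx *ctx, struct model_cq *cq)
{
	bool was_empty;

	if (cq->queued)
		return false;
	was_empty = (ctx->head == NULL);
	cq->queued = true;
	cq->next = NULL;
	*ctx->tail = cq;
	ctx->tail = &cq->next;
	return was_empty;
}

/* Variant A: schedule only on the empty -> non-empty transition.
 * (The real code would hold a spinlock around the push.)
 */
static void model_add_cq(struct model_tasklet_ctx *ctx, struct model_cq *cq)
{
	if (model_push(ctx, cq))
		ctx->schedule_calls++;
}

/* Variant B: only record that scheduling is needed. */
static void model_add_cq_flagged(struct model_tasklet_ctx *ctx,
				 struct model_cq *cq)
{
	model_push(ctx, cq);
	ctx->sched_flag = true;
}

/* Variant B: schedule at most once, at the end of IRQ processing. */
static void model_irq_end(struct model_tasklet_ctx *ctx)
{
	if (ctx->sched_flag) {
		ctx->sched_flag = false;
		ctx->schedule_calls++;
	}
}

/* Simulated tasklet body: drain the queue, return how many CQs it handled. */
static unsigned int model_run_tasklet(struct model_tasklet_ctx *ctx)
{
	struct model_cq *cq = ctx->head;
	unsigned int n = 0;

	ctx->head = NULL;
	ctx->tail = &ctx->head;
	while (cq) {
		struct model_cq *next = cq->next;

		cq->queued = false;
		cq->next = NULL;
		n++;
		cq = next;
	}
	return n;
}
```

In this model, pushing several CQs during one simulated interrupt bumps schedule_calls once with either variant, and a CQ that never calls model_add_cq() or model_add_cq_flagged() never causes a schedule at all, which is the overhead the patch is trying to avoid.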
