[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADUfDZqanDo+v_jap7pQire86QkfaDQE4HvhvVBb64YqKNgRHg@mail.gmail.com>
Date: Tue, 5 Nov 2024 08:06:25 -0800
From: Caleb Sander <csander@...estorage.com>
To: Parav Pandit <parav@...dia.com>
Cc: Saeed Mahameed <saeedm@...dia.com>, Leon Romanovsky <leon@...nel.org>, Tariq Toukan <tariqt@...dia.com>,
Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next 2/2] mlx5/core: deduplicate {mlx5_,}eq_update_ci()
On Mon, Nov 4, 2024 at 9:22 PM Parav Pandit <parav@...dia.com> wrote:
>
>
>
> > From: Caleb Sander <csander@...estorage.com>
> > Sent: Monday, November 4, 2024 3:49 AM
> >
> > On Sat, Nov 2, 2024 at 8:55 PM Parav Pandit <parav@...dia.com> wrote:
> > >
> > >
> > >
> > > > From: Caleb Sander Mateos <csander@...estorage.com>
> > > > Sent: Friday, November 1, 2024 9:17 AM
> > > >
> > > > The logic of eq_update_ci() is duplicated in mlx5_eq_update_ci().
> > > > The only additional work done by mlx5_eq_update_ci() is to increment
> > > > eq->cons_index. Call eq_update_ci() from mlx5_eq_update_ci() to
> > > > eq->avoid
> > > > the duplication.
> > > >
> > > > Signed-off-by: Caleb Sander Mateos <csander@...estorage.com>
> > > > ---
> > > > drivers/net/ethernet/mellanox/mlx5/core/eq.c | 9 +--------
> > > > 1 file changed, 1 insertion(+), 8 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > index 859dcf09b770..078029c81935 100644
> > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > @@ -802,19 +802,12 @@ struct mlx5_eqe *mlx5_eq_get_eqe(struct
> > > > mlx5_eq *eq, u32 cc) } EXPORT_SYMBOL(mlx5_eq_get_eqe);
> > > >
> > > > void mlx5_eq_update_ci(struct mlx5_eq *eq, u32 cc, bool arm) {
> > > > - __be32 __iomem *addr = eq->doorbell + (arm ? 0 : 2);
> > > > - u32 val;
> > > > -
> > > > eq->cons_index += cc;
> > > > - val = (eq->cons_index & 0xffffff) | (eq->eqn << 24);
> > > > -
> > > > - __raw_writel((__force u32)cpu_to_be32(val), addr);
> > > > - /* We still want ordering, just not swabbing, so add a barrier */
> > > > - wmb();
> > > > + eq_update_ci(eq, arm);
> > > Long ago I had similar rework patches to get rid of __raw_writel(),
> > > which I never got chance to push,
> > >
> > > Eq_update_ci() is using full memory barrier.
> > > While mlx5_eq_update_ci() is using only write memory barrier.
> > >
> > > So it is not 100% deduplication by this patch.
> > > Please have a pre-patch improving eq_update_ci() to use wmb().
> > > Followed by this patch.
> >
> > Right, patch 1/2 in this series is changing eq_update_ci() to use
> > writel() instead of __raw_writel() and avoid the memory barrier:
> > https://lore.kernel.org/lkml/20241101034647.51590-1-
> > csander@...estorage.com/
> This patch has two bugs.
> 1. writel() writes the MMIO space in LE order. EQ updates are in BE order.
> So this will break on ppc64 BE.
Okay, so this should be writel(cpu_to_le32(val), addr)?
>
> 2. writel() issues the barrier BEFORE the raw_writel().
> As opposed to that eq update needs to have a barrier AFTER the writel().
> Likely to synchronize with other CQ related pointers update.
I was referencing this prior discussion about the memory barrier:
https://lore.kernel.org/netdev/CALzJLG8af0SMfA1C8U8r_Fddb_ZQhvEZd6=2a97dOoBcgLA0xg@mail.gmail.com/
>From Saeed's message, it sounds like the memory barrier is only used
to ensure the ordering of writes to the doorbell register, not the
ordering of the doorbell write relative to any other writes. If some
other write needs to be ordered after the doorbell write, please
explain what it is. As Gal Pressman pointed out, a wmb() at the end of
a function doesn't make much sense, as there are no further writes in
the function to order. If the doorbell write needs to be ordered
before some other write in a caller function, the memory barrier
should probably move to the caller.
>
> > Are you suggesting something different? If so, it would be great if you could
> > clarify what you mean.
> >
> So I was suggesting to keep __raw_writel() as is and replace mb() with wmb().
wmb() would certainly be cheaper than mb(), but I would like to
understand the requirement for the barrier in the first place. The
fence instruction is very expensive.
Thanks,
Caleb
Powered by blists - more mailing lists