Message-ID: <CADUfDZq0E-GJZxFD4gR7qqpHqcQ2d4cy-Duz7SYMpOZTRvOcKA@mail.gmail.com>
Date: Wed, 6 Nov 2024 20:45:20 -0800
From: Caleb Sander <csander@...estorage.com>
To: Saeed Mahameed <saeed@...nel.org>
Cc: Parav Pandit <parav@...dia.com>, Saeed Mahameed <saeedm@...dia.com>, 
	Leon Romanovsky <leon@...nel.org>, Tariq Toukan <tariqt@...dia.com>, Andrew Lunn <andrew+netdev@...n.ch>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, 
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next 2/2] mlx5/core: deduplicate {mlx5_,}eq_update_ci()

On Wed, Nov 6, 2024 at 6:36 PM Saeed Mahameed <saeed@...nel.org> wrote:
>
> On 06 Nov 15:44, Caleb Sander wrote:
> >On Tue, Nov 5, 2024 at 9:44 PM Parav Pandit <parav@...dia.com> wrote:
> >>
> >>
> >> > From: Caleb Sander <csander@...estorage.com>
> >> > Sent: Tuesday, November 5, 2024 9:36 PM
> >> >
> >> > On Mon, Nov 4, 2024 at 9:22 PM Parav Pandit <parav@...dia.com> wrote:
> >> > >
> >> > >
> >> > >
> >> > > > From: Caleb Sander <csander@...estorage.com>
> >> > > > Sent: Monday, November 4, 2024 3:49 AM
> >> > > >
> >> > > > On Sat, Nov 2, 2024 at 8:55 PM Parav Pandit <parav@...dia.com> wrote:
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > > From: Caleb Sander Mateos <csander@...estorage.com>
> >> > > > > > Sent: Friday, November 1, 2024 9:17 AM
> >> > > > > >
> >> > > > > > The logic of eq_update_ci() is duplicated in mlx5_eq_update_ci().
> >> > > > > > The only additional work done by mlx5_eq_update_ci() is to increment
> >> > > > > > eq->cons_index. Call eq_update_ci() from mlx5_eq_update_ci() to avoid
> >> > > > > > the duplication.
> >> > > > > >
> >> > > > > > Signed-off-by: Caleb Sander Mateos <csander@...estorage.com>
> >> > > > > > ---
> >> > > > > >  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 9 +--------
> >> > > > > >  1 file changed, 1 insertion(+), 8 deletions(-)
> >> > > > > >
> >> > > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> >> > > > > > index 859dcf09b770..078029c81935 100644
> >> > > > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> >> > > > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> >> > > > > > @@ -802,19 +802,12 @@ struct mlx5_eqe *mlx5_eq_get_eqe(struct mlx5_eq *eq, u32 cc)
> >> > > > > >  }
> >> > > > > >  EXPORT_SYMBOL(mlx5_eq_get_eqe);
> >> > > > > >
> >> > > > > >  void mlx5_eq_update_ci(struct mlx5_eq *eq, u32 cc, bool arm)
> >> > > > > >  {
> >> > > > > > -     __be32 __iomem *addr = eq->doorbell + (arm ? 0 : 2);
> >> > > > > > -     u32 val;
> >> > > > > > -
> >> > > > > >       eq->cons_index += cc;
> >> > > > > > -     val = (eq->cons_index & 0xffffff) | (eq->eqn << 24);
> >> > > > > > -
> >> > > > > > -     __raw_writel((__force u32)cpu_to_be32(val), addr);
> >> > > > > > -     /* We still want ordering, just not swabbing, so add a barrier */
> >> > > > > > -     wmb();
> >> > > > > > +     eq_update_ci(eq, arm);
> >> > > > > Long ago I had similar rework patches to get rid of
> >> > > > > __raw_writel(), which I never got a chance to push.
> >> > > > >
> >> > > > > eq_update_ci() uses a full memory barrier,
> >> > > > > while mlx5_eq_update_ci() uses only a write memory barrier.
> >> > > > >
> >> > > > > So it is not 100% deduplication by this patch.
> >> > > > > Please have a pre-patch improving eq_update_ci() to use wmb().
> >> > > > > Followed by this patch.
> >> > > >
> >> > > > Right, patch 1/2 in this series is changing eq_update_ci() to use
> >> > > > writel() instead of __raw_writel() and avoid the memory barrier:
> >> > > > https://lore.kernel.org/lkml/20241101034647.51590-1-csander@...estorage.com/
> >> > > This patch has two bugs.
> >> > > 1. writel() writes the MMIO space in LE order. EQ updates are in BE order.
> >> > > So this will break on ppc64 BE.
> >> >
> >> > Okay, so this should be writel(cpu_to_le32(val), addr)?
> >> >
> >> That would break the x86 side, because the device should receive the value
> >> in BE format regardless of CPU endianness. The above code will write it in
> >> LE format.
> >>
> >> So an API foo_writel() is needed which does:
> >> a. a write memory barrier
> >> b. a write to MMIO space, but without endianness conversion.
> >
> >Got it, thanks. writel(bswap_32(val), addr) should work, then? I
> >suppose it may introduce a second bswap on BE architectures, but
> >that's probably worth it to avoid the memory barrier.
> >
>
> The existing mb() needs to be changed to wmb(), this will provide a more
> efficient fence on most architectures.
>
> I don't understand why you are still discussing the use of writel(). Yes,
> it will work, but you are introducing two unconditional swaps per doorbell
> write.

Well, no memory fence is cheaper still than a wmb(). But it's your
driver, so if you prefer to use wmb() rather than switch to writel(),
that's fine. I'll update the patch series.
As for the byte swaps in writel(bswap_32(val), addr), there would still
be 1 on LE architectures, but 2 instead of 0 on BE architectures.
Certainly a bit inefficient, but probably less overhead than the
memory barrier currently adds on strongly-ordered architectures.
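
To make the byte layout concrete, here is a minimal userspace sketch of
the two write paths. It only models which bytes reach the device; it is
not kernel code and it models no barriers. All names (fake_reg, swap32,
model_writel, model_raw_write_be, model_writel_swapped) are hypothetical
stand-ins, and in the kernel the swap helper would presumably be swab32()
rather than bswap_32().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for a 4-byte MMIO doorbell register. */
typedef struct { uint8_t bytes[4]; } fake_reg;

/* Portable 32-bit byte swap (stand-in for a kernel swab helper). */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Model of writel(): the value reaches the device in little-endian byte
 * order whatever the CPU byte order is. On a BE CPU that costs one swap.
 */
static void model_writel(uint32_t v, fake_reg *reg)
{
	reg->bytes[0] = (uint8_t)(v);
	reg->bytes[1] = (uint8_t)(v >> 8);
	reg->bytes[2] = (uint8_t)(v >> 16);
	reg->bytes[3] = (uint8_t)(v >> 24);
}

/*
 * Model of the existing path, __raw_writel(cpu_to_be32(val), addr):
 * the device sees val in big-endian byte order.
 */
static void model_raw_write_be(uint32_t v, fake_reg *reg)
{
	reg->bytes[0] = (uint8_t)(v >> 24);
	reg->bytes[1] = (uint8_t)(v >> 16);
	reg->bytes[2] = (uint8_t)(v >> 8);
	reg->bytes[3] = (uint8_t)(v);
}

/* The variant discussed above: swap explicitly, then use writel(). */
static void model_writel_swapped(uint32_t v, fake_reg *reg)
{
	model_writel(swap32(v), reg);
}
```

Both model_raw_write_be() and model_writel_swapped() leave the same
big-endian bytes in the register, which is the property the device needs.
The difference is only in how many swaps a given CPU has to perform.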

>
> Just replace the existing mb with wmb() in eq_update_ci()
>
> And if you have time to write one extra patch, please reuse eq_update_ci()
> inside mlx5_eq_update_ci().
>
> mlx5_eq_update_ci(eq, cc, arm) {
>         eq->cons_index += cc;
>         eq_update_ci(eq, arm);
> }
>
> So we won't have two different implementations of EQ doorbell ringing
> anymore.

Isn't this what my patch 2 (at the start of this reply chain) already
does? If you are suggesting something else, please clarify.
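
For reference, the shape being discussed can be sketched in userspace as
follows. This is only an illustration under stated assumptions: struct
fake_eq and the fake_* functions are hypothetical stand-ins for the
driver's types and functions, the doorbell is a plain byte array instead
of MMIO, and the barrier that follows the real store is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct mlx5_eq. */
struct fake_eq {
	uint32_t cons_index;
	uint8_t eqn;
	/* Four 32-bit words; word 0 arms the EQ, word 2 does not. */
	uint8_t doorbell[16];
};

/* Single place that rings the doorbell with the current consumer index. */
static void fake_eq_update_ci(struct fake_eq *eq, int arm)
{
	uint8_t *addr = eq->doorbell + (arm ? 0 : 2) * 4;
	uint32_t val = (eq->cons_index & 0xffffff) | ((uint32_t)eq->eqn << 24);

	/* The device expects the value in big-endian byte order. */
	addr[0] = (uint8_t)(val >> 24);
	addr[1] = (uint8_t)(val >> 16);
	addr[2] = (uint8_t)(val >> 8);
	addr[3] = (uint8_t)(val);
	/* The real driver orders this store with a barrier; omitted here. */
}

/* Exported variant: advance the consumer index, then reuse the helper. */
static void fake_mlx5_eq_update_ci(struct fake_eq *eq, uint32_t cc, int arm)
{
	eq->cons_index += cc;
	fake_eq_update_ci(eq, arm);
}
```

With this split there is only one doorbell implementation, so any later
change to the barrier or the byte order has to be made in a single place.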

Thanks for the reviews,
Caleb
