Message-ID: <20201029131557.GD15697@lore-desk>
Date:   Thu, 29 Oct 2020 14:15:57 +0100
From:   Lorenzo Bianconi <lorenzo.bianconi@...hat.com>
To:     Shay Agroskin <shayagr@...zon.com>
Cc:     Lorenzo Bianconi <lorenzo@...nel.org>, netdev@...r.kernel.org,
        bpf@...r.kernel.org, davem@...emloft.net, kuba@...nel.org,
        brouer@...hat.com, ilias.apalodimas@...aro.org
Subject: Re: [PATCH net-next 4/4] net: mlx5: add xdp tx return bulking support

> 
> Lorenzo Bianconi <lorenzo@...nel.org> writes:
> 
> > Convert mlx5 driver to xdp_return_frame_bulk APIs.
> > 
> > XDP_REDIRECT (upstream codepath): 8.5Mpps
> > XDP_REDIRECT (upstream codepath + bulking APIs): 10.1Mpps
> > 
> > Signed-off-by: Lorenzo Bianconi <lorenzo@...nel.org>
> > ---
> >  drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
> > index ae90d533a350..5fdfbf390d5c 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
> > @@ -369,8 +369,10 @@ static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq,
> >  				  bool recycle)
> >  {
> >  	struct mlx5e_xdp_info_fifo *xdpi_fifo = &sq->db.xdpi_fifo;
> > +	struct xdp_frame_bulk bq;
> >  	u16 i;
> > 
> > +	bq.xa = NULL;
> >  	for (i = 0; i < wi->num_pkts; i++) {
> >  		struct mlx5e_xdp_info xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo);
> > 
> > @@ -379,7 +381,7 @@ static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq,
> >  			/* XDP_TX from the XSK RQ and XDP_REDIRECT */
> >  			dma_unmap_single(sq->pdev, xdpi.frame.dma_addr,
> >  					 xdpi.frame.xdpf->len, DMA_TO_DEVICE);
> > -			xdp_return_frame(xdpi.frame.xdpf);
> > +			xdp_return_frame_bulk(xdpi.frame.xdpf, &bq);
> >  			break;
> >  		case MLX5E_XDP_XMIT_MODE_PAGE:
> >  			/* XDP_TX from the regular RQ */
> > @@ -393,6 +395,7 @@ static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq,
> >  			WARN_ON_ONCE(true);
> >  		}
> >  	}
> > +	xdp_flush_frame_bulk(&bq);
> 
> While I understand the rationale behind this patchset, using an intermediate
> buffer
> 	void *q[XDP_BULK_QUEUE_SIZE];
> means more pressure on the data cache.
> 
> A while ago I ran performance tests on mlx5 to see whether batching skbs
> before passing them to GRO would improve performance, and on some flows I
> got worse performance. This function seems to have less Dcache contention
> than the RX flow, but maybe some performance testing is needed here.
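
For context, the intermediate buffer referenced above is part of struct
xdp_frame_bulk, added earlier in this series; a minimal sketch of its
layout, assuming it matches the version merged into include/net/xdp.h:

#define XDP_BULK_QUEUE_SIZE	16

/* Stack-allocated bulk queue; q[] is the intermediate buffer whose
 * Dcache footprint is being discussed above.
 */
struct xdp_frame_bulk {
	int count;			/* frames currently queued in q[] */
	void *xa;			/* cached mem allocator; NULL == empty */
	void *q[XDP_BULK_QUEUE_SIZE];	/* pending pages to hand back in bulk */
};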

Hi Shay,

this codepath is only activated for "redirected" frames (not for packets
forwarded to the networking stack). We ran performance comparisons against
the upstream code for this particular use case and measured a nice
improvement (from 8.5 Mpps to 10.1 Mpps).
Do you have other performance tests in mind that we should run?

Regards,
Lorenzo

> 
> >  }
> > 
> >  bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq)
> 
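
To make the pattern in the diff above concrete, here is a minimal sketch of
the bulk-return sequence. The function name and the frames[]/n parameters
are hypothetical stand-ins for a driver's completion state;
xdp_return_frame_bulk() and xdp_flush_frame_bulk() are the APIs introduced
earlier in this series:

#include <net/xdp.h>

/* Hypothetical TX-completion handler using the bulk-return pattern:
 * queue each completed frame into a stack-allocated xdp_frame_bulk and
 * flush once after the loop.
 */
static void example_xdp_tx_complete(struct xdp_frame **frames, int n)
{
	struct xdp_frame_bulk bq;
	int i;

	bq.xa = NULL;	/* mark the bulk queue empty, as in the diff */

	for (i = 0; i < n; i++)
		/* Queues the frame for a bulk page_pool return; flushes
		 * internally when the queue fills or the frame belongs to
		 * a different memory allocator.
		 */
		xdp_return_frame_bulk(frames[i], &bq);

	/* Hand back whatever is still batched in bq. */
	xdp_flush_frame_bulk(&bq);
}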
