lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZMkl6HUYMGWXj87P@lincoln>
Date:   Tue, 1 Aug 2023 17:34:00 +0200
From:   Larysa Zaremba <larysa.zaremba@...el.com>
To:     Wei Fang <wei.fang@....com>
CC:     <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
        <pabeni@...hat.com>, <shenwei.wang@....com>,
        <xiaoning.wang@....com>, <ast@...nel.org>, <daniel@...earbox.net>,
        <hawk@...nel.org>, <john.fastabend@...il.com>,
        <netdev@...r.kernel.org>, <linux-imx@....com>,
        <linux-kernel@...r.kernel.org>, <bpf@...r.kernel.org>
Subject: Re: [PATCH V3 net-next] net: fec: add XDP_TX feature support

On Mon, Jul 31, 2023 at 02:00:25PM +0800, Wei Fang wrote:
> The XDP_TX feature is not supported before, and all the frames
> which are deemed to do XDP_TX action actually do the XDP_DROP
> action. So this patch adds the XDP_TX support to FEC driver.
> 
> I tested the performance of XDP_TX feature in XDP_DRV and XDP_SKB
> modes on i.MX8MM-EVK and i.MX8MP-EVK platforms respectively, and
> the test steps and results are as follows.
> 
> Step 1: Board A connects to the FEC port of the DUT and runs the
> pktgen_sample03_burst_single_flow.sh script to generate and send
> burst traffic to DUT. Note that the length of packet was set to
> 64 bytes and the procotol of packet was UDP in my test scenario.
> 
> Step 2: The DUT runs the xdp2 program to transmit received UDP
> packets back out on the same port where they were received.
> 
> root@...8mmevk:~# ./xdp2 eth0
> proto 17:     150326 pkt/s
> proto 17:     141920 pkt/s
> proto 17:     147338 pkt/s
> proto 17:     140783 pkt/s
> proto 17:     150400 pkt/s
> proto 17:     134651 pkt/s
> proto 17:     134676 pkt/s
> proto 17:     134959 pkt/s
> proto 17:     148152 pkt/s
> proto 17:     149885 pkt/s
> 
> root@...8mmevk:~# ./xdp2 -S eth0
> proto 17:     131094 pkt/s
> proto 17:     134691 pkt/s
> proto 17:     138930 pkt/s
> proto 17:     129347 pkt/s
> proto 17:     133050 pkt/s
> proto 17:     132932 pkt/s
> proto 17:     136628 pkt/s
> proto 17:     132964 pkt/s
> proto 17:     131265 pkt/s
> proto 17:     135794 pkt/s
> 
> root@...8mpevk:~# ./xdp2 eth0
> proto 17:     135817 pkt/s
> proto 17:     142776 pkt/s
> proto 17:     142237 pkt/s
> proto 17:     135673 pkt/s
> proto 17:     139508 pkt/s
> proto 17:     147340 pkt/s
> proto 17:     133329 pkt/s
> proto 17:     141171 pkt/s
> proto 17:     146917 pkt/s
> proto 17:     135488 pkt/s
> 
> root@...8mpevk:~# ./xdp2 -S eth0
> proto 17:     133150 pkt/s
> proto 17:     133127 pkt/s
> proto 17:     133538 pkt/s
> proto 17:     133094 pkt/s
> proto 17:     133690 pkt/s
> proto 17:     133199 pkt/s
> proto 17:     133905 pkt/s
> proto 17:     132908 pkt/s
> proto 17:     133292 pkt/s
> proto 17:     133511 pkt/s
> 
> Signed-off-by: Wei Fang <wei.fang@....com>

Reviewed-by: Larysa Zaremba <larysa.zaremba@...el.com>

But I thought XDP-related code should go to bpf-next :/
Could somebody clarify this?

> ---
> V2 changes:
> According to Jakub's comments, the V2 patch adds two changes.
> 1. Call txq_trans_cond_update() in fec_enet_xdp_tx_xmit() to avoid
> tx timeout as XDP shares the queues with kernel stack.
> 2. Tx processing shouldn't call any XDP (or page pool) APIs if the
> "budget" is 0.
> 
> V3 changes:
> 1. Remove the second change in V2, because this change has been
> separated into another patch and it has been submmitted to the
> upstream [1].
> [1] https://lore.kernel.org/r/20230725074148.2936402-1-wei.fang@nxp.com
> ---
>  drivers/net/ethernet/freescale/fec.h      |  1 +
>  drivers/net/ethernet/freescale/fec_main.c | 80 ++++++++++++++++++-----
>  2 files changed, 65 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
> index 8f1edcca96c4..f35445bddc7a 100644
> --- a/drivers/net/ethernet/freescale/fec.h
> +++ b/drivers/net/ethernet/freescale/fec.h
> @@ -547,6 +547,7 @@ enum {
>  enum fec_txbuf_type {
>  	FEC_TXBUF_T_SKB,
>  	FEC_TXBUF_T_XDP_NDO,
> +	FEC_TXBUF_T_XDP_TX,
>  };
>  
>  struct fec_tx_buffer {
> diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
> index 14d0dc7ba3c9..2068fe95504e 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -75,6 +75,8 @@
>  
>  static void set_multicast_list(struct net_device *ndev);
>  static void fec_enet_itr_coal_set(struct net_device *ndev);
> +static int fec_enet_xdp_tx_xmit(struct net_device *ndev,
> +				struct xdp_buff *xdp);
>  
>  #define DRIVER_NAME	"fec"
>  
> @@ -960,7 +962,8 @@ static void fec_enet_bd_init(struct net_device *dev)
>  					txq->tx_buf[i].skb = NULL;
>  				}
>  			} else {
> -				if (bdp->cbd_bufaddr)
> +				if (bdp->cbd_bufaddr &&
> +				    txq->tx_buf[i].type == FEC_TXBUF_T_XDP_NDO)
>  					dma_unmap_single(&fep->pdev->dev,
>  							 fec32_to_cpu(bdp->cbd_bufaddr),
>  							 fec16_to_cpu(bdp->cbd_datlen),
> @@ -1423,7 +1426,8 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
>  				break;
>  
>  			xdpf = txq->tx_buf[index].xdp;
> -			if (bdp->cbd_bufaddr)
> +			if (bdp->cbd_bufaddr &&
> +			    txq->tx_buf[index].type == FEC_TXBUF_T_XDP_NDO)
>  				dma_unmap_single(&fep->pdev->dev,
>  						 fec32_to_cpu(bdp->cbd_bufaddr),
>  						 fec16_to_cpu(bdp->cbd_datlen),
> @@ -1482,7 +1486,7 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
>  			/* Free the sk buffer associated with this last transmit */
>  			dev_kfree_skb_any(skb);
>  		} else {
> -			xdp_return_frame(xdpf);
> +			xdp_return_frame_rx_napi(xdpf);
>  
>  			txq->tx_buf[index].xdp = NULL;
>  			/* restore default tx buffer type: FEC_TXBUF_T_SKB */
> @@ -1573,11 +1577,18 @@ fec_enet_run_xdp(struct fec_enet_private *fep, struct bpf_prog *prog,
>  		}
>  		break;
>  
> -	default:
> -		bpf_warn_invalid_xdp_action(fep->netdev, prog, act);
> -		fallthrough;
> -
>  	case XDP_TX:
> +		err = fec_enet_xdp_tx_xmit(fep->netdev, xdp);
> +		if (err) {
> +			ret = FEC_ENET_XDP_CONSUMED;
> +			page = virt_to_head_page(xdp->data);
> +			page_pool_put_page(rxq->page_pool, page, sync, true);
> +		} else {
> +			ret = FEC_ENET_XDP_TX;
> +		}
> +		break;
> +
> +	default:
>  		bpf_warn_invalid_xdp_action(fep->netdev, prog, act);
>  		fallthrough;
>  
> @@ -3793,7 +3804,8 @@ fec_enet_xdp_get_tx_queue(struct fec_enet_private *fep, int index)
>  
>  static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
>  				   struct fec_enet_priv_tx_q *txq,
> -				   struct xdp_frame *frame)
> +				   struct xdp_frame *frame,
> +				   bool ndo_xmit)
>  {
>  	unsigned int index, status, estatus;
>  	struct bufdesc *bdp;
> @@ -3813,10 +3825,24 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
>  
>  	index = fec_enet_get_bd_index(bdp, &txq->bd);
>  
> -	dma_addr = dma_map_single(&fep->pdev->dev, frame->data,
> -				  frame->len, DMA_TO_DEVICE);
> -	if (dma_mapping_error(&fep->pdev->dev, dma_addr))
> -		return -ENOMEM;
> +	if (ndo_xmit) {
> +		dma_addr = dma_map_single(&fep->pdev->dev, frame->data,
> +					  frame->len, DMA_TO_DEVICE);
> +		if (dma_mapping_error(&fep->pdev->dev, dma_addr))
> +			return -ENOMEM;
> +
> +		txq->tx_buf[index].type = FEC_TXBUF_T_XDP_NDO;
> +	} else {
> +		struct page *page = virt_to_page(frame->data);
> +
> +		dma_addr = page_pool_get_dma_addr(page) + sizeof(*frame) +
> +			   frame->headroom;
> +		dma_sync_single_for_device(&fep->pdev->dev, dma_addr,
> +					   frame->len, DMA_BIDIRECTIONAL);
> +		txq->tx_buf[index].type = FEC_TXBUF_T_XDP_TX;
> +	}
> +
> +	txq->tx_buf[index].xdp = frame;
>  
>  	status |= (BD_ENET_TX_INTR | BD_ENET_TX_LAST);
>  	if (fep->bufdesc_ex)
> @@ -3835,9 +3861,6 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
>  		ebdp->cbd_esc = cpu_to_fec32(estatus);
>  	}
>  
> -	txq->tx_buf[index].type = FEC_TXBUF_T_XDP_NDO;
> -	txq->tx_buf[index].xdp = frame;
> -
>  	/* Make sure the updates to rest of the descriptor are performed before
>  	 * transferring ownership.
>  	 */
> @@ -3863,6 +3886,31 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
>  	return 0;
>  }
>  
> +static int fec_enet_xdp_tx_xmit(struct net_device *ndev,
> +				struct xdp_buff *xdp)
> +{
> +	struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp);
> +	struct fec_enet_private *fep = netdev_priv(ndev);
> +	struct fec_enet_priv_tx_q *txq;
> +	int cpu = smp_processor_id();
> +	struct netdev_queue *nq;
> +	int queue, ret;
> +
> +	queue = fec_enet_xdp_get_tx_queue(fep, cpu);
> +	txq = fep->tx_queue[queue];
> +	nq = netdev_get_tx_queue(fep->netdev, queue);
> +
> +	__netif_tx_lock(nq, cpu);
> +
> +	/* Avoid tx timeout as XDP shares the queue with kernel stack */
> +	txq_trans_cond_update(nq);
> +	ret = fec_enet_txq_xmit_frame(fep, txq, xdpf, false);
> +
> +	__netif_tx_unlock(nq);
> +
> +	return ret;
> +}
> +
>  static int fec_enet_xdp_xmit(struct net_device *dev,
>  			     int num_frames,
>  			     struct xdp_frame **frames,
> @@ -3885,7 +3933,7 @@ static int fec_enet_xdp_xmit(struct net_device *dev,
>  	/* Avoid tx timeout as XDP shares the queue with kernel stack */
>  	txq_trans_cond_update(nq);
>  	for (i = 0; i < num_frames; i++) {
> -		if (fec_enet_txq_xmit_frame(fep, txq, frames[i]) < 0)
> +		if (fec_enet_txq_xmit_frame(fep, txq, frames[i], true) < 0)
>  			break;
>  		sent_frames++;
>  	}
> -- 
> 2.25.1
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ