lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZdNF0ArChUkGzk42@boxer>
Date: Mon, 19 Feb 2024 13:13:04 +0100
From: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
To: Florian Kauer <florian.kauer@...utronix.de>
CC: Jesse Brandeburg <jesse.brandeburg@...el.com>, Tony Nguyen
	<anthony.l.nguyen@...el.com>, "David S. Miller" <davem@...emloft.net>, "Eric
 Dumazet" <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
	<pabeni@...hat.com>, Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann
	<daniel@...earbox.net>, Jesper Dangaard Brouer <hawk@...nel.org>, "John
 Fastabend" <john.fastabend@...il.com>, Jithu Joseph <jithu.joseph@...el.com>,
	Andre Guedes <andre.guedes@...el.com>, Vedang Patel <vedang.patel@...el.com>,
	<kurt@...utronix.de>, <intel-wired-lan@...ts.osuosl.org>,
	<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<bpf@...r.kernel.org>
Subject: Re: [PATCH net 1/1] igc: avoid returning frame twice in XDP_REDIRECT

On Mon, Feb 19, 2024 at 10:08:43AM +0100, Florian Kauer wrote:
> When a frame can not be transmitted in XDP_REDIRECT
> (e.g. due to a full queue), it is necessary to free
> it by calling xdp_return_frame_rx_napi.
> 
> However, this is the reponsibility of the caller of
> the ndo_xdp_xmit (see for example bq_xmit_all in
> kernel/bpf/devmap.c) and thus calling it inside
> igc_xdp_xmit (which is the ndo_xdp_xmit of the igc
> driver) as well will lead to memory corruption.
> 
> In fact, bq_xmit_all expects that it can return all
> frames after the last successfully transmitted one.
> Therefore, break for the first not transmitted frame,
> but do not call xdp_return_frame_rx_napi in igc_xdp_xmit.
> This is equally implemented in other Intel drivers
> such as the igb.
> 
> There are two alternatives to this that were rejected:
> 1. Return num_frames as all the frames would have been
>    transmitted and release them inside igc_xdp_xmit.
>    While it might work technically, it is not what
>    the return value is meant to repesent (i.e. the
>    number of SUCCESSFULLY transmitted packets).
> 2. Rework kernel/bpf/devmap.c and all drivers to
>    support non-consecutively dropped packets.
>    Besides being complex, it likely has a negative
>    performance impact without a significant gain
>    since it is anyway unlikely that the next frame
>    can be transmitted if the previous one was dropped.
> 
> The memory corruption can be reproduced with
> the following script which leads to a kernel panic
> after a few seconds.  It basically generates more
> traffic than a i225 NIC can transmit and pushes it
> via XDP_REDIRECT from a virtual interface to the
> physical interface where frames get dropped.
> 
>    #!/bin/bash
>    INTERFACE=enp4s0
>    INTERFACE_IDX=`cat /sys/class/net/$INTERFACE/ifindex`
> 
>    sudo ip link add dev veth1 type veth peer name veth2
>    sudo ip link set up $INTERFACE
>    sudo ip link set up veth1
>    sudo ip link set up veth2
> 
>    cat << EOF > redirect.bpf.c
> 
>    SEC("prog")
>    int redirect(struct xdp_md *ctx)
>    {
>        return bpf_redirect($INTERFACE_IDX, 0);
>    }
> 
>    char _license[] SEC("license") = "GPL";
>    EOF
>    clang -O2 -g -Wall -target bpf -c redirect.bpf.c -o redirect.bpf.o
>    sudo ip link set veth2 xdp obj redirect.bpf.o
> 
>    cat << EOF > pass.bpf.c
> 
>    SEC("prog")
>    int pass(struct xdp_md *ctx)
>    {
>        return XDP_PASS;
>    }
> 
>    char _license[] SEC("license") = "GPL";
>    EOF
>    clang -O2 -g -Wall -target bpf -c pass.bpf.c -o pass.bpf.o
>    sudo ip link set $INTERFACE xdp obj pass.bpf.o
> 
>    cat << EOF > trafgen.cfg
> 
>    {
>      /* Ethernet Header */
>      0xe8, 0x6a, 0x64, 0x41, 0xbf, 0x46,
>      0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
>      const16(ETH_P_IP),
> 
>      /* IPv4 Header */
>      0b01000101, 0,   # IPv4 version, IHL, TOS
>      const16(1028),   # IPv4 total length (UDP length + 20 bytes (IP header))
>      const16(2),      # IPv4 ident
>      0b01000000, 0,   # IPv4 flags, fragmentation off
>      64,              # IPv4 TTL
>      17,              # Protocol UDP
>      csumip(14, 33),  # IPv4 checksum
> 
>      /* UDP Header */
>      10,  0, 1, 1,    # IP Src - adapt as needed
>      10,  0, 1, 2,    # IP Dest - adapt as needed
>      const16(6666),   # UDP Src Port
>      const16(6666),   # UDP Dest Port
>      const16(1008),   # UDP length (UDP header 8 bytes + payload length)
>      csumudp(14, 34), # UDP checksum
> 
>      /* Payload */
>      fill('W', 1000),
>    }
>    EOF
> 
>    sudo trafgen -i trafgen.cfg -b3000MB -o veth1 --cpp
> 
> Fixes: 4ff320361092 ("igc: Add support for XDP_REDIRECT action")
> Signed-off-by: Florian Kauer <florian.kauer@...utronix.de>

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@...el.com>

Apparently fdc13979f91e ("bpf, devmap: Move drop error path to devmap for
XDP_REDIRECT") addressed all ndo_xdp_xmit callbacks but igc support was
added shortly after that...?

> ---
>  drivers/net/ethernet/intel/igc/igc_main.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index ba8d3fe186ae..81c21a893ede 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -6487,7 +6487,7 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
>  	int cpu = smp_processor_id();
>  	struct netdev_queue *nq;
>  	struct igc_ring *ring;
> -	int i, drops;
> +	int i, nxmit;
>  
>  	if (unlikely(!netif_carrier_ok(dev)))
>  		return -ENETDOWN;
> @@ -6503,16 +6503,15 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
>  	/* Avoid transmit queue timeout since we share it with the slow path */
>  	txq_trans_cond_update(nq);
>  
> -	drops = 0;
> +	nxmit = 0;
>  	for (i = 0; i < num_frames; i++) {
>  		int err;
>  		struct xdp_frame *xdpf = frames[i];
>  
>  		err = igc_xdp_init_tx_descriptor(ring, xdpf);
> -		if (err) {
> -			xdp_return_frame_rx_napi(xdpf);
> -			drops++;
> -		}
> +		if (err)
> +			break;
> +		nxmit++;
>  	}
>  
>  	if (flags & XDP_XMIT_FLUSH)
> @@ -6520,7 +6519,7 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
>  
>  	__netif_tx_unlock(nq);
>  
> -	return num_frames - drops;
> +	return nxmit;
>  }
>  
>  static void igc_trigger_rxtxq_interrupt(struct igc_adapter *adapter,
> -- 
> 2.39.2
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ