[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZdNF0ArChUkGzk42@boxer>
Date: Mon, 19 Feb 2024 13:13:04 +0100
From: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
To: Florian Kauer <florian.kauer@...utronix.de>
CC: Jesse Brandeburg <jesse.brandeburg@...el.com>, Tony Nguyen
<anthony.l.nguyen@...el.com>, "David S. Miller" <davem@...emloft.net>, "Eric
Dumazet" <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>, Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann
<daniel@...earbox.net>, Jesper Dangaard Brouer <hawk@...nel.org>, "John
Fastabend" <john.fastabend@...il.com>, Jithu Joseph <jithu.joseph@...el.com>,
Andre Guedes <andre.guedes@...el.com>, Vedang Patel <vedang.patel@...el.com>,
<kurt@...utronix.de>, <intel-wired-lan@...ts.osuosl.org>,
<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<bpf@...r.kernel.org>
Subject: Re: [PATCH net 1/1] igc: avoid returning frame twice in XDP_REDIRECT
On Mon, Feb 19, 2024 at 10:08:43AM +0100, Florian Kauer wrote:
> When a frame can not be transmitted in XDP_REDIRECT
> (e.g. due to a full queue), it is necessary to free
> it by calling xdp_return_frame_rx_napi.
>
> However, this is the reponsibility of the caller of
> the ndo_xdp_xmit (see for example bq_xmit_all in
> kernel/bpf/devmap.c) and thus calling it inside
> igc_xdp_xmit (which is the ndo_xdp_xmit of the igc
> driver) as well will lead to memory corruption.
>
> In fact, bq_xmit_all expects that it can return all
> frames after the last successfully transmitted one.
> Therefore, break for the first not transmitted frame,
> but do not call xdp_return_frame_rx_napi in igc_xdp_xmit.
> This is equally implemented in other Intel drivers
> such as the igb.
>
> There are two alternatives to this that were rejected:
> 1. Return num_frames as all the frames would have been
> transmitted and release them inside igc_xdp_xmit.
> While it might work technically, it is not what
> the return value is meant to repesent (i.e. the
> number of SUCCESSFULLY transmitted packets).
> 2. Rework kernel/bpf/devmap.c and all drivers to
> support non-consecutively dropped packets.
> Besides being complex, it likely has a negative
> performance impact without a significant gain
> since it is anyway unlikely that the next frame
> can be transmitted if the previous one was dropped.
>
> The memory corruption can be reproduced with
> the following script which leads to a kernel panic
> after a few seconds. It basically generates more
> traffic than a i225 NIC can transmit and pushes it
> via XDP_REDIRECT from a virtual interface to the
> physical interface where frames get dropped.
>
> #!/bin/bash
> INTERFACE=enp4s0
> INTERFACE_IDX=`cat /sys/class/net/$INTERFACE/ifindex`
>
> sudo ip link add dev veth1 type veth peer name veth2
> sudo ip link set up $INTERFACE
> sudo ip link set up veth1
> sudo ip link set up veth2
>
> cat << EOF > redirect.bpf.c
>
> SEC("prog")
> int redirect(struct xdp_md *ctx)
> {
> return bpf_redirect($INTERFACE_IDX, 0);
> }
>
> char _license[] SEC("license") = "GPL";
> EOF
> clang -O2 -g -Wall -target bpf -c redirect.bpf.c -o redirect.bpf.o
> sudo ip link set veth2 xdp obj redirect.bpf.o
>
> cat << EOF > pass.bpf.c
>
> SEC("prog")
> int pass(struct xdp_md *ctx)
> {
> return XDP_PASS;
> }
>
> char _license[] SEC("license") = "GPL";
> EOF
> clang -O2 -g -Wall -target bpf -c pass.bpf.c -o pass.bpf.o
> sudo ip link set $INTERFACE xdp obj pass.bpf.o
>
> cat << EOF > trafgen.cfg
>
> {
> /* Ethernet Header */
> 0xe8, 0x6a, 0x64, 0x41, 0xbf, 0x46,
> 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
> const16(ETH_P_IP),
>
> /* IPv4 Header */
> 0b01000101, 0, # IPv4 version, IHL, TOS
> const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header))
> const16(2), # IPv4 ident
> 0b01000000, 0, # IPv4 flags, fragmentation off
> 64, # IPv4 TTL
> 17, # Protocol UDP
> csumip(14, 33), # IPv4 checksum
>
> /* UDP Header */
> 10, 0, 1, 1, # IP Src - adapt as needed
> 10, 0, 1, 2, # IP Dest - adapt as needed
> const16(6666), # UDP Src Port
> const16(6666), # UDP Dest Port
> const16(1008), # UDP length (UDP header 8 bytes + payload length)
> csumudp(14, 34), # UDP checksum
>
> /* Payload */
> fill('W', 1000),
> }
> EOF
>
> sudo trafgen -i trafgen.cfg -b3000MB -o veth1 --cpp
>
> Fixes: 4ff320361092 ("igc: Add support for XDP_REDIRECT action")
> Signed-off-by: Florian Kauer <florian.kauer@...utronix.de>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
Apparently fdc13979f91e ("bpf, devmap: Move drop error path to devmap for
XDP_REDIRECT") addressed all ndo_xdp_xmit callbacks but igc support was
added shortly after that...?
> ---
> drivers/net/ethernet/intel/igc/igc_main.c | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index ba8d3fe186ae..81c21a893ede 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -6487,7 +6487,7 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
> int cpu = smp_processor_id();
> struct netdev_queue *nq;
> struct igc_ring *ring;
> - int i, drops;
> + int i, nxmit;
>
> if (unlikely(!netif_carrier_ok(dev)))
> return -ENETDOWN;
> @@ -6503,16 +6503,15 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
> /* Avoid transmit queue timeout since we share it with the slow path */
> txq_trans_cond_update(nq);
>
> - drops = 0;
> + nxmit = 0;
> for (i = 0; i < num_frames; i++) {
> int err;
> struct xdp_frame *xdpf = frames[i];
>
> err = igc_xdp_init_tx_descriptor(ring, xdpf);
> - if (err) {
> - xdp_return_frame_rx_napi(xdpf);
> - drops++;
> - }
> + if (err)
> + break;
> + nxmit++;
> }
>
> if (flags & XDP_XMIT_FLUSH)
> @@ -6520,7 +6519,7 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
>
> __netif_tx_unlock(nq);
>
> - return num_frames - drops;
> + return nxmit;
> }
>
> static void igc_trigger_rxtxq_interrupt(struct igc_adapter *adapter,
> --
> 2.39.2
>
Powered by blists - more mailing lists