[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6e58cde8-9e38-079a-589d-7b7a860ef61e@iogearbox.net>
Date: Tue, 15 Sep 2020 17:49:32 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Magnus Karlsson <magnus.karlsson@...il.com>,
magnus.karlsson@...el.com, bjorn.topel@...el.com, ast@...nel.org,
netdev@...r.kernel.org, jonathan.lemon@...il.com
Cc: A.Zema@...convsystems.com
Subject: Re: [PATCH bpf v4] xsk: do not discard packet when NETDEV_TX_BUSY
Hey Magnus,
On 9/11/20 2:43 PM, Magnus Karlsson wrote:
> From: Magnus Karlsson <magnus.karlsson@...el.com>
>
> In the skb Tx path, transmission of a packet is performed with
> dev_direct_xmit(). When NETDEV_TX_BUSY is set in the drivers, it
> signifies that it was not possible to send the packet right now,
> please try later. Unfortunately, the xsk transmit code discarded the
> packet and returned EBUSY to the application. Fix this unnecessary
> packet loss, by not discarding the packet in the Tx ring and return
> EAGAIN. As EAGAIN is returned to the application, it can then retry
> the send operation later and the packet will then likely be sent as
> the driver will then likely have space/resources to send the packet.
>
> In summary, EAGAIN tells the application that the packet was not
> discarded from the Tx ring and that it needs to call send()
> again. EBUSY, on the other hand, signifies that the packet was not
> sent and discarded from the Tx ring. The application needs to put the
> packet on the Tx ring again if it wants it to be sent.
>
> Fixes: 35fcde7f8deb ("xsk: support for Tx")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@...el.com>
> Reported-by: Arkadiusz Zema <A.Zema@...convsystems.com>
> Suggested-by: Arkadiusz Zema <A.Zema@...convsystems.com>
> Suggested-by: Daniel Borkmann <daniel@...earbox.net>
> ---
> v3->v4:
> * Free the skb without triggering the drop trace when NETDEV_TX_BUSY
> * Call consume_skb instead of kfree_skb when the packet has been
> sent successfully for correct tracing
> * Use sock_wfree as destructor when NETDEV_TX_BUSY
> v1->v3:
> * Hinder dev_direct_xmit() from freeing and completing the packet to
> user space by manipulating the skb->users count as suggested by
> Daniel Borkmann.
> ---
> net/xdp/xsk.c | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index c323162..d32e39d 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -377,15 +377,30 @@ static int xsk_generic_xmit(struct sock *sk)
> skb_shinfo(skb)->destructor_arg = (void *)(long)desc.addr;
> skb->destructor = xsk_destruct_skb;
>
> + /* Hinder dev_direct_xmit from freeing the packet and
> + * therefore completing it in the destructor
> + */
> + refcount_inc(&skb->users);
> err = dev_direct_xmit(skb, xs->queue_id);
> + if (err == NETDEV_TX_BUSY) {
> + /* Tell user-space to retry the send */
> + skb->destructor = sock_wfree;
I see, good catch, you need this one here as otherwise you leak wmem accounting
given it's also part of xsk_destruct_skb() and we do free the prior allocated skb
in this case.
> + /* Free skb without triggering the perf drop trace */
> + __kfree_skb(skb);
As a minor nit, I would just use consume_skb(skb) here given this doesn't blindly
ignore the skb_unref(). It's mostly about seeing where drops are happening so that
tracepoint is set to kfree_skb() which is the more interesting one. Other than that
looks good and ready to go. Thanks (& sorry for late reply)!
> + err = -EAGAIN;
> + goto out;
> + }
> +
> xskq_cons_release(xs->tx);
> /* Ignore NET_XMIT_CN as packet might have been sent */
> - if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> + if (err == NET_XMIT_DROP) {
> /* SKB completed but not sent */
> + kfree_skb(skb);
> err = -EBUSY;
> goto out;
> }
>
> + consume_skb(skb);
> sent_frame = true;
> }
>
>
Powered by blists - more mailing lists