Date:   Tue, 15 Sep 2020 17:49:32 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Magnus Karlsson <magnus.karlsson@...il.com>,
        magnus.karlsson@...el.com, bjorn.topel@...el.com, ast@...nel.org,
        netdev@...r.kernel.org, jonathan.lemon@...il.com
Cc:     A.Zema@...convsystems.com
Subject: Re: [PATCH bpf v4] xsk: do not discard packet when NETDEV_TX_BUSY

Hey Magnus,

On 9/11/20 2:43 PM, Magnus Karlsson wrote:
> From: Magnus Karlsson <magnus.karlsson@...el.com>
> 
> In the skb Tx path, transmission of a packet is performed with
> dev_direct_xmit(). When NETDEV_TX_BUSY is set in the drivers, it
> signifies that it was not possible to send the packet right now,
> please try later. Unfortunately, the xsk transmit code discarded the
> packet and returned EBUSY to the application. Fix this unnecessary
> packet loss, by not discarding the packet in the Tx ring and return
> EAGAIN. As EAGAIN is returned to the application, it can then retry
> the send operation later and the packet will then likely be sent as
> the driver will then likely have space/resources to send the packet.
> 
> In summary, EAGAIN tells the application that the packet was not
> discarded from the Tx ring and that it needs to call send()
> again. EBUSY, on the other hand, signifies that the packet was not
> sent and discarded from the Tx ring. The application needs to put the
> packet on the Tx ring again if it wants it to be sent.
> 
> Fixes: 35fcde7f8deb ("xsk: support for Tx")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@...el.com>
> Reported-by: Arkadiusz Zema <A.Zema@...convsystems.com>
> Suggested-by: Arkadiusz Zema <A.Zema@...convsystems.com>
> Suggested-by: Daniel Borkmann <daniel@...earbox.net>
> ---
> v3->v4:
> * Free the skb without triggering the drop trace when NETDEV_TX_BUSY
> * Call consume_skb instead of kfree_skb when the packet has been
>    sent successfully for correct tracing
> * Use sock_wfree as destructor when NETDEV_TX_BUSY
> v1->v3:
> * Hinder dev_direct_xmit() from freeing and completing the packet to
>    user space by manipulating the skb->users count as suggested by
>    Daniel Borkmann.
> ---
>   net/xdp/xsk.c | 17 ++++++++++++++++-
>   1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index c323162..d32e39d 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -377,15 +377,30 @@ static int xsk_generic_xmit(struct sock *sk)
>   		skb_shinfo(skb)->destructor_arg = (void *)(long)desc.addr;
>   		skb->destructor = xsk_destruct_skb;
>   
> +		/* Hinder dev_direct_xmit from freeing the packet and
> +		 * therefore completing it in the destructor
> +		 */
> +		refcount_inc(&skb->users);
>   		err = dev_direct_xmit(skb, xs->queue_id);
> +		if  (err == NETDEV_TX_BUSY) {
> +			/* Tell user-space to retry the send */
> +			skb->destructor = sock_wfree;

I see, good catch. You need this one here, as otherwise you would leak the wmem
accounting, given that it's also part of xsk_destruct_skb() and we do free the
previously allocated skb in this case.

> +			/* Free skb without triggering the perf drop trace */
> +			__kfree_skb(skb);

As a minor nit, I would just use consume_skb(skb) here, given that, unlike
__kfree_skb(), it doesn't blindly ignore the skb_unref(). The tracepoint is mostly
about seeing where drops are happening, which is why it is set in kfree_skb(), the
more interesting one. Other than that, looks good and ready to go. Thanks (& sorry
for the late reply)!
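
To make the reference counting in the busy path easier to follow, here is a
minimal user-space sketch, not kernel code. Everything prefixed with toy_ is a
hypothetical stand-in invented for illustration: toy_consume_skb() models a
refcount-respecting free, toy_xmit_busy() assumes the transmit helper drops its
own reference when the driver reports busy, and the two counters stand for the
completion to user space and the wmem release.

```c
#include <stdbool.h>

enum { TOY_TX_OK = 0, TOY_TX_BUSY = 1 };	/* illustrative values only */

/* Toy stand-in for an skb; only the fields this sketch needs. */
struct toy_skb {
	int users;				/* models skb->users */
	void (*destructor)(struct toy_skb *skb);
	int completions;			/* frames completed to user space */
	int wmem_released;			/* wmem accounting releases */
	bool freed;
};

/* Models a destructor that only releases the wmem accounting. */
static void toy_sock_wfree(struct toy_skb *skb)
{
	skb->wmem_released++;
}

/* Models a destructor that completes the frame and releases wmem. */
static void toy_xsk_destruct_skb(struct toy_skb *skb)
{
	skb->completions++;
	toy_sock_wfree(skb);
}

/* Drop one reference; run the destructor only if it was the last one. */
static void toy_consume_skb(struct toy_skb *skb)
{
	if (--skb->users > 0)
		return;
	if (skb->destructor)
		skb->destructor(skb);
	skb->freed = true;
}

/* Assumed behaviour: a busy transmit drops the reference it was given. */
static int toy_xmit_busy(struct toy_skb *skb)
{
	toy_consume_skb(skb);
	return TOY_TX_BUSY;
}

/* Replays the busy path of the hunk above on the toy model. */
struct toy_skb toy_busy_path(void)
{
	struct toy_skb skb = { .users = 1, .destructor = toy_xsk_destruct_skb };

	skb.users++;				/* extra reference before xmit */
	if (toy_xmit_busy(&skb) == TOY_TX_BUSY) {
		skb.destructor = toy_sock_wfree;	/* wmem only, no completion */
		toy_consume_skb(&skb);		/* last reference: frees */
	}
	return skb;
}
```

In this model the extra reference keeps the destructor from running inside the
transmit call, and swapping the destructor before the final put releases the
wmem accounting without completing the frame, so the descriptor can stay on the
Tx ring for a retry.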

> +			err = -EAGAIN;
> +			goto out;
> +		}
> +
>   		xskq_cons_release(xs->tx);
>   		/* Ignore NET_XMIT_CN as packet might have been sent */
> -		if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> +		if (err == NET_XMIT_DROP) {
>   			/* SKB completed but not sent */
> +			kfree_skb(skb);
>   			err = -EBUSY;
>   			goto out;
>   		}
>   
> +		consume_skb(skb);
>   		sent_frame = true;
>   	}
>   
> 
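
For the application side, the EAGAIN/EBUSY contract from the commit message
could be handled roughly as sketched below. This is an assumption-laden
illustration rather than anything from the patch: classify_tx_errno(),
kick_tx() and the tx_action enum are hypothetical names, xsk_fd is assumed to
be an already set up AF_XDP socket, and the empty non-blocking sendto() is
assumed to be how the application triggers transmission.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical action codes for the application's Tx loop. */
enum tx_action {
	TX_DONE,		/* send call succeeded */
	TX_RETRY_SEND,		/* EAGAIN: packet still on the Tx ring, call send again */
	TX_REQUEUE_FRAME,	/* EBUSY: packet discarded, put it on the Tx ring again */
	TX_OTHER_ERROR,		/* anything else is left to the caller */
};

/* Map an errno value from the send call to the action described above. */
enum tx_action classify_tx_errno(int err)
{
	switch (err) {
	case 0:
		return TX_DONE;
	case EAGAIN:
		return TX_RETRY_SEND;
	case EBUSY:
		return TX_REQUEUE_FRAME;
	default:
		return TX_OTHER_ERROR;
	}
}

/* Kick transmission, repeating the call up to max_retries times on EAGAIN. */
enum tx_action kick_tx(int xsk_fd, int max_retries)
{
	enum tx_action action;
	int attempt = 0;

	do {
		int err = 0;

		if (sendto(xsk_fd, NULL, 0, MSG_DONTWAIT, NULL, 0) < 0)
			err = errno;
		action = classify_tx_errno(err);
	} while (action == TX_RETRY_SEND && attempt++ < max_retries);

	return action;
}
```

A caller would re-enqueue the descriptor only on TX_REQUEUE_FRAME; on
TX_RETRY_SEND the descriptor is still in the ring, so re-enqueueing it would
send the packet twice.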
