[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAJ8uoz2nYwGROzRwevT2X5U-XyGePrbJyM63iE1QZL=V-Y4pUg@mail.gmail.com>
Date: Tue, 15 Sep 2020 19:46:35 +0200
From: Magnus Karlsson <magnus.karlsson@...il.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: "Karlsson, Magnus" <magnus.karlsson@...el.com>,
Björn Töpel <bjorn.topel@...el.com>,
Alexei Starovoitov <ast@...nel.org>,
Network Development <netdev@...r.kernel.org>,
Jonathan Lemon <jonathan.lemon@...il.com>,
A.Zema@...convsystems.com
Subject: Re: [PATCH bpf v4] xsk: do not discard packet when NETDEV_TX_BUSY
On Tue, Sep 15, 2020 at 5:49 PM Daniel Borkmann <daniel@...earbox.net> wrote:
>
> Hey Magnus,
>
> On 9/11/20 2:43 PM, Magnus Karlsson wrote:
> > From: Magnus Karlsson <magnus.karlsson@...el.com>
> >
> > In the skb Tx path, transmission of a packet is performed with
> > dev_direct_xmit(). When NETDEV_TX_BUSY is set in the drivers, it
> > signifies that it was not possible to send the packet right now,
> > please try later. Unfortunately, the xsk transmit code discarded the
> > packet and returned EBUSY to the application. Fix this unnecessary
> > packet loss, by not discarding the packet in the Tx ring and return
> > EAGAIN. As EAGAIN is returned to the application, it can then retry
> > the send operation later and the packet will then likely be sent as
> > the driver will then likely have space/resources to send the packet.
> >
> > In summary, EAGAIN tells the application that the packet was not
> > discarded from the Tx ring and that it needs to call send()
> > again. EBUSY, on the other hand, signifies that the packet was not
> > sent and discarded from the Tx ring. The application needs to put the
> > packet on the Tx ring again if it wants it to be sent.
> >
> > Fixes: 35fcde7f8deb ("xsk: support for Tx")
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@...el.com>
> > Reported-by: Arkadiusz Zema <A.Zema@...convsystems.com>
> > Suggested-by: Arkadiusz Zema <A.Zema@...convsystems.com>
> > Suggested-by: Daniel Borkmann <daniel@...earbox.net>
> > ---
> > v3->v4:
> > * Free the skb without triggering the drop trace when NETDEV_TX_BUSY
> > * Call consume_skb instead of kfree_skb when the packet has been
> > sent successfully for correct tracing
> > * Use sock_wfree as destructor when NETDEV_TX_BUSY
> > v1->v3:
> > * Hinder dev_direct_xmit() from freeing and completing the packet to
> > user space by manipulating the skb->users count as suggested by
> > Daniel Borkmann.
> > ---
> > net/xdp/xsk.c | 17 ++++++++++++++++-
> > 1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index c323162..d32e39d 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -377,15 +377,30 @@ static int xsk_generic_xmit(struct sock *sk)
> > skb_shinfo(skb)->destructor_arg = (void *)(long)desc.addr;
> > skb->destructor = xsk_destruct_skb;
> >
> > + /* Hinder dev_direct_xmit from freeing the packet and
> > + * therefore completing it in the destructor
> > + */
> > + refcount_inc(&skb->users);
> > err = dev_direct_xmit(skb, xs->queue_id);
> > + if (err == NETDEV_TX_BUSY) {
> > + /* Tell user-space to retry the send */
> > + skb->destructor = sock_wfree;
>
> I see, good catch, you need this one here as otherwise you leak wmem accounting
> given it's also part of xsk_destruct_skb() and we do free the prior allocated skb
> in this case.
>
> > + /* Free skb without triggering the perf drop trace */
> > + __kfree_skb(skb);
>
> As a minor nit, I would just use consume_skb(skb) here given this doesn't blindly
> ignore the skb_unref(). It's mostly about seeing where drops are happening so that
> tracepoint is set to kfree_skb() which is the more interesting one. Other than that
> looks good and ready to go. Thanks (& sorry for late reply)!
Thank you for reviewing this. I will spin a v5.
> > + err = -EAGAIN;
> > + goto out;
> > + }
> > +
> > xskq_cons_release(xs->tx);
> > /* Ignore NET_XMIT_CN as packet might have been sent */
> > - if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> > + if (err == NET_XMIT_DROP) {
> > /* SKB completed but not sent */
> > + kfree_skb(skb);
> > err = -EBUSY;
> > goto out;
> > }
> >
> > + consume_skb(skb);
> > sent_frame = true;
> > }
> >
> >
>
Powered by blists - more mailing lists