Date:   Mon, 17 Aug 2020 16:08:27 +0200
From:   Magnus Karlsson <magnus.karlsson@...il.com>
To:     Daniel Borkmann <daniel@...earbox.net>
Cc:     Magnus Karlsson <magnus.karlsson@...el.com>,
        Björn Töpel <bjorn.topel@...el.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Network Development <netdev@...r.kernel.org>,
        Jonathan Lemon <jonathan.lemon@...il.com>,
        A.Zema@...convsystems.com
Subject: Re: [PATCH bpf v3] xsk: do not discard packet when QUEUE_STATE_FROZEN

On Tue, Jul 21, 2020 at 10:46 PM Daniel Borkmann <daniel@...earbox.net> wrote:
>
> On 7/20/20 3:53 PM, Magnus Karlsson wrote:
> > In the skb Tx path, transmission of a packet is performed with
> > dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> > routines, it returns NETDEV_TX_BUSY signifying that it was not
> > possible to send the packet now, please try later. Unfortunately, the
> > xsk transmit code discarded the packet and returned EBUSY to the
> > application. Fix this unnecessary packet loss, by not discarding the
> > packet in the Tx ring and return EAGAIN. As EAGAIN is returned to the
> > application, it can then retry the send operation and the packet will
> > finally be sent as we will likely not be in the QUEUE_STATE_FROZEN
> > state anymore. So EAGAIN tells the application that the packet was not
> > discarded from the Tx ring and that it needs to call send()
> > again. EBUSY, on the other hand, signifies that the packet was not
> > sent and discarded from the Tx ring. The application needs to put the
> > packet on the Tx ring again if it wants it to be sent.
> >
> > Fixes: 35fcde7f8deb ("xsk: support for Tx")
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@...el.com>
> > Reported-by: Arkadiusz Zema <A.Zema@...convsystems.com>
> > Suggested-by: Arkadiusz Zema <A.Zema@...convsystems.com>
> > Suggested-by: Daniel Borkmann <daniel@...earbox.net>
> > ---
> > v1->v3:
> > * Hinder dev_direct_xmit() from freeing and completing the packet to
> >    user space by manipulating the skb->users count as suggested by
> >    Daniel Borkmann.
> > ---
> >   net/xdp/xsk.c | 15 ++++++++++++++-
> >   1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index 3700266..9e95c85 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -375,10 +375,23 @@ static int xsk_generic_xmit(struct sock *sk)
> >               skb_shinfo(skb)->destructor_arg = (void *)(long)desc.addr;
> >               skb->destructor = xsk_destruct_skb;
> >
> > +             /* Hinder dev_direct_xmit from freeing the packet and
> > +              * therefore completing it in the destructor
> > +              */
> > +             refcount_inc(&skb->users);
> >               err = dev_direct_xmit(skb, xs->queue_id);
> > +             if  (err == NETDEV_TX_BUSY) {
> > +                     /* QUEUE_STATE_FROZEN, tell app to retry the send */
> > +                     skb->destructor = NULL;
> > +                     kfree_skb(skb);
> > +                     err = -EAGAIN;
> > +                     goto out;
> > +             }
> > +
> >               xskq_cons_release(xs->tx);
> > +             kfree_skb(skb);
>
> What happens if this was properly 'consumed'. If you call kfree_skb() for these pkts,
> then doesn't this confuse perf drop monitor with false positives?

I have been on extended vacation, so sorry for the delay. The
trace_kfree_skb() call comes after the refcount check in kfree_skb().
That means all "consumption"/freeing moves to the code in this
patch. So if an skb was consumed/freed in dev_direct_xmit() before this
patch, that same skb will show up as consumed/freed in
xsk_generic_xmit() instead with this patch. I do not see how we can
change that with this refcounting approach, or do you have an
idea?
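
To illustrate the point, here is a small user-space toy model, not kernel
code. All toy_* names are made up for this sketch. It only assumes what is
stated above: the trace event fires after the refcount check, so it is
recorded by whoever drops the last reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct sk_buff: only the reference count is modelled. */
struct toy_skb {
	int users;
};

/* Number of "free" trace events recorded so far (stand-in for the tracepoint). */
static int toy_free_events;

/* Modelled on the ordering described above: drop one reference and
 * record the event only if that was the last one.
 */
static bool toy_kfree_skb(struct toy_skb *skb)
{
	assert(skb->users > 0);
	if (--skb->users > 0)
		return false;	/* someone else still holds a reference */
	toy_free_events++;	/* the trace event would fire here */
	return true;
}

/* Toy transmit path: always drops its reference to the skb. */
static void toy_dev_direct_xmit(struct toy_skb *skb)
{
	toy_kfree_skb(skb);
}

/* Returns how many free events were recorded inside the transmit call. */
static int toy_xsk_xmit(bool hold_extra_ref)
{
	struct toy_skb skb = { .users = 1 };
	int before = toy_free_events;
	int in_xmit;

	if (hold_extra_ref)
		skb.users++;		/* what refcount_inc() does in the patch */

	toy_dev_direct_xmit(&skb);
	in_xmit = toy_free_events - before;

	if (hold_extra_ref)
		toy_kfree_skb(&skb);	/* last reference dropped by the caller */

	return in_xmit;
}
```

In this model, toy_xsk_xmit(false) records the event inside the transmit
call, while toy_xsk_xmit(true) records it in the caller. The total number
of events is the same, only the place where they are recorded moves.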

Another idea would be to just modify dev_direct_xmit() along the lines
that I originally suggested. But that would also mean a small change to
AF_PACKET, as well as to a driver that uses it for testing. That is more
intrusive than the above, but the perf drop monitor would record the
event in the correct place. Happy for suggestions on how to proceed.
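
For reference, the EAGAIN/EBUSY contract from the commit message could be
handled roughly like this on the application side. This is only a sketch:
kick_tx(), kick_fn, enum tx_result and scripted_kick() are hypothetical
names. In a real AF_XDP application the callback would presumably wrap the
usual sendto(fd, NULL, 0, MSG_DONTWAIT, NULL, 0) call on the xsk socket.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: returns 0 on success, -1 with errno set on failure. */
typedef int (*kick_fn)(void *ctx);

enum tx_result {
	TX_OK,		/* kick succeeded */
	TX_REQUEUE,	/* EBUSY: packet left the Tx ring, put it back to resend */
	TX_GAVE_UP,	/* still EAGAIN after all retries, packet is still on the ring */
	TX_ERROR,	/* any other errno */
};

/* Retry the kick while EAGAIN is returned, up to max_retries extra attempts. */
static enum tx_result kick_tx(kick_fn kick, void *ctx, int max_retries)
{
	assert(kick != NULL);

	for (int attempt = 0; attempt <= max_retries; attempt++) {
		if (kick(ctx) == 0)
			return TX_OK;
		if (errno == EAGAIN)
			continue;	/* packet not discarded, just call again */
		if (errno == EBUSY)
			return TX_REQUEUE;
		return TX_ERROR;
	}
	return TX_GAVE_UP;
}

/* Stub for trying the logic without a socket: replays a list of errno
 * values, where 0 means success.
 */
struct script {
	const int *errs;
	size_t n;
	size_t pos;
};

static int scripted_kick(void *ctx)
{
	struct script *s = ctx;
	int e = s->pos < s->n ? s->errs[s->pos++] : 0;

	if (e == 0)
		return 0;
	errno = e;
	return -1;
}
```

The retry bound is an arbitrary choice for the sketch. An application
could just as well return to its poll loop on EAGAIN and kick again later.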

Thanks: Magnus

> >               /* Ignore NET_XMIT_CN as packet might have been sent */
> > -             if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> > +             if (err == NET_XMIT_DROP) {
> >                       /* SKB completed but not sent */
> >                       err = -EBUSY;
> >                       goto out;
> >
>
