lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 17 Jan 2023 11:48:13 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     Laurent Vivier <lvivier@...hat.com>
Cc:     "Michael S. Tsirkin" <mst@...hat.com>,
        Jakub Kicinski <kuba@...nel.org>, Wei Wang <weiwan@...gle.com>,
        David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
        Willem de Bruijn <willemb@...gle.com>,
        virtualization@...ts.linux-foundation.org,
        Stefano Brivio <sbrivio@...hat.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 4/4] virtio_net: disable cb aggressively

On Mon, Jan 16, 2023 at 9:41 PM Laurent Vivier <lvivier@...hat.com> wrote:
>
> Hi Michael,
>
> On 5/26/21 10:24, Michael S. Tsirkin wrote:
> > There are currently two cases where we poll TX vq not in response to a
> > callback: start xmit and rx napi.  We currently do this with callbacks
> > enabled which can cause extra interrupts from the card.  Used not to be
> > a big issue as we run with interrupts disabled but that is no longer the
> > case, and in some cases the rate of spurious interrupts is so high
> > linux detects this and actually kills the interrupt.
> >
> > Fix up by disabling the callbacks before polling the tx vq.
> >
> > Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
> > ---
> >   drivers/net/virtio_net.c | 16 ++++++++++++----
> >   1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index c29f42d1e04f..a83dc038d8af 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -1433,7 +1433,10 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
> >               return;
> >
> >       if (__netif_tx_trylock(txq)) {
> > -             free_old_xmit_skbs(sq, true);
> > +             do {
> > +                     virtqueue_disable_cb(sq->vq);
> > +                     free_old_xmit_skbs(sq, true);
> > +             } while (unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> >
> >               if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
> >                       netif_tx_wake_queue(txq);
> > @@ -1605,12 +1608,17 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >       struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
> >       bool kick = !netdev_xmit_more();
> >       bool use_napi = sq->napi.weight;
> > +     unsigned int bytes = skb->len;
> >
> >       /* Free up any pending old buffers before queueing new ones. */
> > -     free_old_xmit_skbs(sq, false);
> > +     do {
> > +             if (use_napi)
> > +                     virtqueue_disable_cb(sq->vq);
> >
> > -     if (use_napi && kick)
> > -             virtqueue_enable_cb_delayed(sq->vq);
> > +             free_old_xmit_skbs(sq, false);
> > +
> > +     } while (use_napi && kick &&
> > +            unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> >
> >       /* timestamp packet in software */
> >       skb_tx_timestamp(skb);
>
> This patch seems to introduce a problem with QEMU connected to passt using netdev stream
> backend.
>
> When I run an iperf3 test I get after 1 or 2 seconds of test:
>
> [  254.035559] NETDEV WATCHDOG: ens3 (virtio_net): transmit queue 0 timed out
> ...
> [  254.060962] virtio_net virtio1 ens3: TX timeout on queue: 0, sq: output.0, vq: 0x1,
> name: output.0, 8856000 usecs ago
> [  259.155150] virtio_net virtio1 ens3: TX timeout on queue: 0, sq: output.0, vq: 0x1,
> name: output.0, 13951000 usecs ago
>
> In QEMU, I can see in virtio_net_tx_bh() the function virtio_net_flush_tx() has flushed
> all the queue entries and re-enabled the queue notification with
> virtio_queue_set_notification() and tries to flush again the queue and as it is empty it
> does nothing more and then rely on a kernel notification to re-enable the bottom half
> function. As this notification never comes the queue is stuck and kernel add entries but
> QEMU doesn't remove them:
>
> 2812 static void virtio_net_tx_bh(void *opaque)
> 2813 {
> ...
> 2833     ret = virtio_net_flush_tx(q);
>
> -> flush the queue and ret is not an error and not n->tx_burst (that would re-schedule the
> function)
>
> ...
> 2850     virtio_queue_set_notification(q->tx_vq, 1);
>
> -> re-enable the queue notification
>
> 2851     ret = virtio_net_flush_tx(q);
> 2852     if (ret == -EINVAL) {
> 2853         return;
> 2854     } else if (ret > 0) {
> 2855         virtio_queue_set_notification(q->tx_vq, 0);
> 2856         qemu_bh_schedule(q->tx_bh);
> 2857         q->tx_waiting = 1;
> 2858     }
>
> -> ret is 0, exit the function without re-scheduling the function.
> ...
> 2859 }
>
> If I revert this patch in the kernel (a7766ef18b33 ("virtio_net: disable cb
> aggressively")), it works fine.
>
> How to reproduce it:
>
> I start passt (https://passt.top/passt):
>
> passt -f
>
> and then QEMU
>
> qemu-system-x86_64 ... --netdev
> stream,id=netdev0,server=off,addr.type=unix,addr.path=/tmp/passt_1.socket -device
> virtio-net,mac=9a:2b:2c:2d:2e:2f,netdev=netdev0
>
> Host side:
>
> sysctl -w net.core.rmem_max=134217728
> sysctl -w net.core.wmem_max=134217728
> iperf3 -s
>
> Guest side:
>
> sysctl -w net.core.rmem_max=536870912
> sysctl -w net.core.wmem_max=536870912
>
> ip link set dev $DEV mtu 256
> iperf3 -c $HOST -t10 -i0 -Z -P 8 -l 1M --pacing-timer 1000000 -w 4M
>
> Any idea of what is the problem?

This looks similar to what I spot and try to fix in:

[PATCH net V3] virtio-net: correctly enable callback during start_xmit

(I've cced you in this version).

Thanks

>
> Thanks,
> Laurent
>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ