Message-ID: <20250403180403-mutt-send-email-mst@kernel.org>
Date: Thu, 3 Apr 2025 18:05:43 -0400
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Markus Fohrer <markus.fohrer@...ked.de>
Cc: virtualization@...ts.linux-foundation.org, jasowang@...hat.com,
davem@...emloft.net, edumazet@...gle.com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
On Thu, Apr 03, 2025 at 11:24:43PM +0200, Markus Fohrer wrote:
> On Thursday, 03.04.2025 at 17:06 -0400, Michael S. Tsirkin wrote:
> > On Thu, Apr 03, 2025 at 10:07:12PM +0200, Markus Fohrer wrote:
> > > On Thursday, 03.04.2025 at 10:03 -0400, Michael S. Tsirkin wrote:
> > > > On Thu, Apr 03, 2025 at 03:51:01PM +0200, Markus Fohrer wrote:
> > > > > On Thursday, 03.04.2025 at 09:04 -0400, Michael S. Tsirkin wrote:
> > > > > > > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > > I'm observing a significant performance regression in KVM guest
> > > > > > > > VMs using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > > > > > >
> > > > > > > > When running on a host system equipped with a Broadcom
> > > > > > > > NetXtreme-E (bnxt_en) NIC and AMD EPYC CPUs, the network
> > > > > > > > throughput in the guest drops to 100–200 KB/s. The same guest
> > > > > > > > configuration performs normally (~100 MB/s) when using kernel
> > > > > > > > 6.8.0 or when the VM is moved to a host with Intel NICs.
> > > > > > >
> > > > > > > Test environment:
> > > > > > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > > > > > - Guest: Linux with virtio-net interface
> > > > > > > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host level)
> > > > > > > > - CPU: AMD EPYC
> > > > > > > > - Storage: virtio-scsi
> > > > > > > > - VM network: virtio-net, virtio-scsi (no CPU or IO bottlenecks)
> > > > > > > > - Traffic test: iperf3, scp, wget consistently slow in guest
> > > > > > >
> > > > > > > This issue is not present:
> > > > > > > - On 6.8.0
> > > > > > > - On hosts with Intel NICs (same VM config)
> > > > > > >
> > > > > > > I have bisected the issue to the following upstream commit:
> > > > > > >
> > > > > > > > 49d14b54a527 ("virtio-net: Suppress tx timeout warning for small tx")
> > > > > > > https://git.kernel.org/linus/49d14b54a527
> > > > > >
> > > > > > Thanks a lot for the info!
> > > > > >
> > > > > >
> > > > > > both the link and commit point at:
> > > > > >
> > > > > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > > > > Author: Eric Dumazet <edumazet@...gle.com>
> > > > > > Date: Thu Sep 26 16:58:36 2024 +0000
> > > > > >
> > > > > > > net: test for not too small csum_start in virtio_net_hdr_to_skb()
> > > > > >
> > > > > >
> > > > > > is this what you mean?
> > > > > >
> > > > > > > I don't know which commit is "virtio-net: Suppress tx timeout
> > > > > > > warning for small tx"
> > > > > >
> > > > > >
> > > > > >
> > > > > > > > Reverting this commit restores normal network performance in
> > > > > > > > affected guest VMs.
> > > > > > > >
> > > > > > > > I’m happy to provide more data or assist with testing a
> > > > > > > > potential fix.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Markus Fohrer
> > > > > >
> > > > > >
> > > > > > > Thanks! First, I think it's worth checking the setup, e.g.
> > > > > > > which offloads are enabled.
> > > > > > > Besides that, I'd start by seeing what's going on. Assuming
> > > > > > > I'm right about Eric's patch:
> > > > > >
> > > > > > > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > > > > > > index 276ca543ef44d8..02a9f4dc594d02 100644
> > > > > > > --- a/include/linux/virtio_net.h
> > > > > > > +++ b/include/linux/virtio_net.h
> > > > > > > @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
> > > > > > >
> > > > > > > if (!skb_partial_csum_set(skb, start, off))
> > > > > > > return -EINVAL;
> > > > > > > + if (skb_transport_offset(skb) < nh_min_len)
> > > > > > > + return -EINVAL;
> > > > > > >
> > > > > > > - nh_min_len = max_t(u32, nh_min_len, skb_transport_offset(skb));
> > > > > > > + nh_min_len = skb_transport_offset(skb);
> > > > > > > p_off = nh_min_len + thlen;
> > > > > > > if (!pskb_may_pull(skb, p_off))
> > > > > > > return -EINVAL;
> > > > > >
> > > > > >
> > > > > > > sticking a printk before return -EINVAL to show the offset
> > > > > > > and nh_min_len would be a good 1st step. Thanks!
> > > > > >
> > > > >
> > > > >
> > > > > Hi Eric,
> > > > >
> > > > > thanks a lot for the quick response — and yes, you're
> > > > > absolutely
> > > > > right.
> > > > >
> > > > > Apologies for the confusion: I mistakenly wrote the wrong
> > > > > commit
> > > > > description in my initial mail.
> > > > >
> > > > > The correct commit is indeed:
> > > > >
> > > > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > > > Author: Eric Dumazet <edumazet@...gle.com>
> > > > > Date: Thu Sep 26 16:58:36 2024 +0000
> > > > >
> > > > > net: test for not too small csum_start in
> > > > > virtio_net_hdr_to_skb()
> > > > >
> > > > > This is the one I bisected and which causes the performance
> > > > > regression
> > > > > in my environment.
> > > > >
> > > > > Thanks again,
> > > > > Markus
> > > >
> > > >
> > > > I'm not Eric but good to know.
> > > > Alright, so I would start with two items: the device features and
> > > > the printk.
> > > >
> > >
> > > as requested, here’s the device/feature information from the guest
> > > running kernel 6.14 (mainline):
> > >
> > > Interface: ens18
> > >
> > > ethtool -i ens18:
> > > driver: virtio_net
> > > version: 1.0.0
> > > firmware-version:
> > > expansion-rom-version:
> > > bus-info: 0000:00:12.0
> > > supports-statistics: yes
> > > supports-test: no
> > > supports-eeprom-access: no
> > > supports-register-dump: no
> > > supports-priv-flags: no
> > >
> > >
> > > ethtool -k ens18:
> > > Features for ens18:
> > > rx-checksumming: on [fixed]
> > > tx-checksumming: on
> > > tx-checksum-ipv4: off [fixed]
> > > tx-checksum-ip-generic: on
> > > tx-checksum-ipv6: off [fixed]
> > > tx-checksum-fcoe-crc: off [fixed]
> > > tx-checksum-sctp: off [fixed]
> > > scatter-gather: on
> > > tx-scatter-gather: on
> > > tx-scatter-gather-fraglist: off [fixed]
> > > tcp-segmentation-offload: on
> > > tx-tcp-segmentation: on
> > > tx-tcp-ecn-segmentation: on
> > > tx-tcp-mangleid-segmentation: off
> > > tx-tcp6-segmentation: on
> > > generic-segmentation-offload: on
> > > generic-receive-offload: on
> > > large-receive-offload: off [fixed]
> > > rx-vlan-offload: off [fixed]
> > > tx-vlan-offload: off [fixed]
> > > ntuple-filters: off [fixed]
> > > receive-hashing: off [fixed]
> > > highdma: on [fixed]
> > > rx-vlan-filter: on [fixed]
> > > vlan-challenged: off [fixed]
> > > tx-gso-robust: on [fixed]
> > > tx-fcoe-segmentation: off [fixed]
> > > tx-gre-segmentation: off [fixed]
> > > tx-gre-csum-segmentation: off [fixed]
> > > tx-ipxip4-segmentation: off [fixed]
> > > tx-ipxip6-segmentation: off [fixed]
> > > tx-udp_tnl-segmentation: off [fixed]
> > > tx-udp_tnl-csum-segmentation: off [fixed]
> > > tx-gso-partial: off [fixed]
> > > tx-tunnel-remcsum-segmentation: off [fixed]
> > > tx-sctp-segmentation: off [fixed]
> > > tx-esp-segmentation: off [fixed]
> > > tx-udp-segmentation: off
> > > tx-gso-list: off [fixed]
> > > tx-nocache-copy: off
> > > loopback: off [fixed]
> > > rx-fcs: off [fixed]
> > > rx-all: off [fixed]
> > > tx-vlan-stag-hw-insert: off [fixed]
> > > rx-vlan-stag-hw-parse: off [fixed]
> > > rx-vlan-stag-filter: off [fixed]
> > > l2-fwd-offload: off [fixed]
> > > hw-tc-offload: off [fixed]
> > > esp-hw-offload: off [fixed]
> > > esp-tx-csum-hw-offload: off [fixed]
> > > rx-udp_tunnel-port-offload: off [fixed]
> > > tls-hw-tx-offload: off [fixed]
> > > tls-hw-rx-offload: off [fixed]
> > > rx-gro-hw: on
> > > tls-hw-record: off [fixed]
> > > rx-gro-list: off
> > > macsec-hw-offload: off [fixed]
> > > rx-udp-gro-forwarding: off
> > > hsr-tag-ins-offload: off [fixed]
> > > hsr-tag-rm-offload: off [fixed]
> > > hsr-fwd-offload: off [fixed]
> > > hsr-dup-offload: off [fixed]
> > >
> > > ethtool ens18:
> > > Settings for ens18:
> > > Supported ports: [ ]
> > > Supported link modes: Not reported
> > > Supported pause frame use: No
> > > Supports auto-negotiation: No
> > > Supported FEC modes: Not reported
> > > Advertised link modes: Not reported
> > > Advertised pause frame use: No
> > > Advertised auto-negotiation: No
> > > Advertised FEC modes: Not reported
> > > Speed: Unknown!
> > > Duplex: Unknown! (255)
> > > Auto-negotiation: off
> > > Port: Other
> > > PHYAD: 0
> > > Transceiver: internal
> > > netlink error: Operation not permitted
> > > Link detected: yes
> > >
> > >
> > > Kernel log (journalctl -k):
> > > Apr 03 19:50:37 kb-test.allod.com kernel: virtio_scsi virtio2: 4/0/0 default/read/poll queues
> > > Apr 03 19:50:37 kb-test.allod.com kernel: virtio_net virtio1 ens18: renamed from eth0
> > >
> > > Let me know if you’d like comparison data from kernel 6.11 or any
> > > additional tests.
> >
> >
> > I think let's redo the bisect first, then I will suggest which traces
> > to add.
> >
>
> The build with the added printk is currently running. I’ll test it
> shortly and report the results.
>
> Should the bisect be done between v6.11 and v6.12, or v6.11 and v6.14?
The commit you showed is between 6.11 and 6.12. Having said that,
you can manually check out 49d14b54a527289d09a9480f214b8c586322310a
and 49d14b54a527289d09a9480f214b8c586322310a~1, test each, and record
the results with git bisect bad/good. If the results confirm the
regression, git bisect will stop immediately for you.
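
For reference while adding the printk, here is a rough userspace model of
what the quoted hunk changes. It is only a sketch: the function names
p_off_before() and p_off_after() are made up for illustration, fprintf()
stands in for printk, and the numbers in the note below are arbitrary.
It is not the kernel code itself.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Before the patch: a transport offset smaller than nh_min_len is
 * silently raised to nh_min_len (the max_t() on the removed line),
 * so the packet is still accepted.
 */
int p_off_before(uint32_t transport_off, uint32_t nh_min_len,
		 uint32_t thlen, uint32_t *p_off)
{
	uint32_t start = transport_off > nh_min_len ? transport_off
						     : nh_min_len;

	*p_off = start + thlen;
	return 0;
}

/*
 * After the patch: the same packet is rejected with -EINVAL.
 * The fprintf() marks where the suggested printk would go.
 */
int p_off_after(uint32_t transport_off, uint32_t nh_min_len,
		uint32_t thlen, uint32_t *p_off)
{
	if (transport_off < nh_min_len) {
		fprintf(stderr, "transport offset %u < nh_min_len %u\n",
			(unsigned int)transport_off,
			(unsigned int)nh_min_len);
		return -EINVAL;
	}

	*p_off = transport_off + thlen;
	return 0;
}
```

With a transport offset of 14, nh_min_len of 20 and thlen of 20, the old
logic accepts the packet with p_off = 40, while the new logic returns
-EINVAL. If the printk fires in the guest, that would suggest packets
hitting this path are now being dropped.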