Message-ID: <e9d36f5e-fc14-9c54-c519-7e1615959973@gameservers.com>
Date:   Fri, 19 Jan 2018 15:28:00 -0500
From:   Brian Rak <brak@...eservers.com>
To:     qemu-devel <qemu-devel@...gnu.org>, netdev@...r.kernel.org
Subject: virtio_net occasionally stops sending packets

We've been running into a fairly persistent issue where virtio_net 
adapters will suddenly stop sending packets when running under KVM.  
This has persisted through several qemu versions, and a large number of 
guest kernel upgrades.

What we end up seeing is the guest continuing to receive packets, but 
refusing to transmit anything.

If we leave ping running for a minute or two, it will eventually start 
printing "ping: sendmsg: No buffer space available" messages.  tcpdump 
will show nothing being sent from the guest. `ip -s link` will show RX 
bytes/packets incrementing, but not TX. `tc -s qdisc show dev eth1` 
shows the 'dropped' counter incrementing with every new ping attempt.
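
Since the failure shows up as RX counters advancing while TX counters stay 
flat, a small watcher inside the guest could flag it without waiting for 
someone to notice.  The following is only a rough sketch, not something 
we run: it assumes the standard Linux sysfs counters under 
/sys/class/net/<iface>/statistics/, and the function names, sampling 
interval and window size are made up for illustration.  An idle interface 
that has nothing to send would look the same, so it only makes sense with 
a background ping or similar traffic running on that interface.

```python
import sys
import time
from pathlib import Path


def read_counters(iface, root="/sys/class/net"):
    """Read the RX/TX packet counters for one interface from sysfs."""
    stats = Path(root) / iface / "statistics"
    return {
        "rx": int((stats / "rx_packets").read_text()),
        "tx": int((stats / "tx_packets").read_text()),
    }


def tx_stalled(samples, min_rx_delta=1):
    """True if RX advanced between the first and last sample but TX did not."""
    if len(samples) < 2:
        return False
    first, last = samples[0], samples[-1]
    rx_delta = last["rx"] - first["rx"]
    tx_delta = last["tx"] - first["tx"]
    return rx_delta >= min_rx_delta and tx_delta == 0


def watch(iface, interval=10.0, window=6):
    """Poll the counters and report when TX has been flat for a full window."""
    samples = []
    while True:
        samples.append(read_counters(iface))
        samples = samples[-window:]  # keep only the most recent samples
        if len(samples) == window and tx_stalled(samples):
            print(f"{iface}: RX still incrementing, TX flat for "
                  f"{interval * (window - 1):.0f}s", flush=True)
        time.sleep(interval)


if __name__ == "__main__":
    # Interface name is an assumption; eth1 matches the examples in this mail.
    watch(sys.argv[1] if len(sys.argv) > 1 else "eth1")
```

Catching the stall when it starts would at least give a timestamp to 
correlate against host-side logs.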

I attempted to run 'dropwatch -l kas', which showed me a bunch of lines 
that looked like '4 drops at pfifo_fast_enqueue+85'.

So far, we haven't been able to consistently reproduce this.  We have a 
few guests that will hit this issue roughly once every two weeks, but we 
haven't been able to reproduce it on demand.  This seems to happen with 
guests that have more than one network adapter attached.  I do not think 
we've seen it on guests that only have one NIC.

We've tried guest kernels as new as 4.14.13, and qemu versions as new as 
2.11.0.  This doesn't appear to be related to the physical network at 
all; we've seen this happen with a variety of network backends:
* qemu 'multicast' networks
* macvtap attached to a vxlan interface
* bridged interface

We've tried disabling a variety of offloads (gso, tso4, tso6, ecn) from 
both the host and guest sides.  This didn't really have any effect.

The only way to fix this once it breaks is to restart the guest OS.  
`ifdown eth1; ifup eth1` doesn't seem to help.

How can I determine if this is a qemu issue, or an issue with the 
virtio_net driver?  We have not tried this with other virtual nic types 
yet.  I'm not sure if that would provide any useful information or not.  
We're still working on figuring out how to reproduce this, but I'm not 
terribly hopeful about coming up with a simple set of reproduction steps.

This was my post about it a few years ago: 
https://lists.nongnu.org/archive/html/qemu-devel/2015-01/msg03907.html
