Date:	Tue, 24 Feb 2015 09:03:48 -0800
From:	Rick Jones <rick.jones2@...com>
To:	Michael Kazakov <michael@...athonbet.ru>, netdev@...r.kernel.org
Subject: Re: Packet drops in virtual tap interface

On 02/23/2015 11:18 PM, Michael Kazakov wrote:
>
>
> Hello. We run a highly loaded system in an OpenStack environment and are
> facing serious network problems in guests under heavy load. At a relatively
> high hypervisor CPU load (40-60%), the virtual network interfaces of these
> guests start to drop part of the packets. For virtualization we use
> qemu-kvm with the vhost driver: "... -netdev
> tap,fd=30,id=hostnet0,vhost=on,vhostfd=31 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:86:67:7b,bus=pci.0,addr=0x3...".
>
>
> The problem can be seen with the ifconfig utility:
> tape4009073-0b Link encap:Ethernet  HWaddr fe:16:3e:86:67:7b
>            inet6 addr: fe80::fc16:3eff:fe86:677b/64 Scope:Link
>            UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>            RX packets:1587622634 errors:0 dropped:0 overruns:0 frame:0
>            TX packets:1484106438 errors:0 dropped:460259 overruns:0
> carrier:0
>            collisions:0 txqueuelen:500
>            RX bytes:877878711500 (877.8 GB)  TX bytes:3071846828531 (3.0
> TB)
>
> Could you take a look at our problem and advise in which direction to
> continue our investigation?

If I recall the OpenStack plumbing correctly, the TX direction on the 
tap device is inbound to the instance.  You could, I suppose, try 
increasing the size of the txqueuelen via ifconfig, but you may want to 
triple-check that the KVM I/O thread (?) isn't running at 100%, 
particularly if, when you say the hypervisor is running at 40-60% CPU 
utilization, you mean overall utilization of a multi-CPU system.
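As a rough way to confirm that the queue really is overflowing, and when, something like the following sketch could watch the tap device's tx_dropped counter over time. This is only an illustration: the device name is the one from the ifconfig output above, and the sysfs statistics path is the standard Linux location.

```python
# Hedged sketch: sample a tap device's TX drop counter once a second and
# print the per-second increases, to correlate drops with hypervisor load.
import time

def drop_deltas(samples):
    """Per-interval increases in a monotonically growing drop counter."""
    return [b - a for a, b in zip(samples, samples[1:])]

def read_tx_dropped(dev):
    # Standard Linux per-interface statistics location.
    with open(f"/sys/class/net/{dev}/statistics/tx_dropped") as f:
        return int(f.read())

if __name__ == "__main__":
    dev = "tape4009073-0b"   # assumed: the tap name from the report above
    samples = []
    for _ in range(10):
        samples.append(read_tx_dropped(dev))
        time.sleep(1)
    print(drop_deltas(samples))
```

If the deltas are bursty rather than steady, that would fit the "VM held off from running for a while" picture rather than a constantly undersized queue.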

I'm assuming that in broad handwaving terms, the TX queue of the tap 
device is behaving something like the SO_RCVBUF of a UDP socket would in 
a "normal" system - if the VM (tap device case) or receiving process 
(UDP socket case) is held up for a little while, that buffer is there to 
try to pick up the slack.  If the VM is held off from running long 
enough, that queue will overflow and traffic will be dropped.
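For what it's worth, that UDP analogy is easy to demonstrate directly. A minimal sketch, loopback only and nothing OpenStack-specific: shrink SO_RCVBUF, send a burst without reading, then drain and count what survived.

```python
# Hedged sketch: a "held-up receiver" overflowing a small kernel buffer,
# analogous to a tap TX queue overflowing while the guest isn't running.
import socket

def overflow_demo(n_sent=1000, payload=b"x" * 1200):
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4096)  # tiny buffer
    rx.bind(("127.0.0.1", 0))
    addr = rx.getsockname()

    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for _ in range(n_sent):          # blast while the "receiver" is held up
        tx.sendto(payload, addr)

    rx.setblocking(False)            # now drain whatever the kernel kept
    received = 0
    try:
        while True:
            rx.recv(2048)
            received += 1
    except BlockingIOError:
        pass
    tx.close()
    rx.close()
    return n_sent, received

if __name__ == "__main__":
    sent, got = overflow_demo()
    print(f"sent {sent}, delivered {got}, dropped {sent - got}")
```

The datagrams that don't fit in the receive buffer are dropped silently by the kernel, just as the tap device's TX queue drops frames once the guest falls far enough behind.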

That might be the "raising the bridge" side of things.  The "lower the 
river" side might be to find out why and for how long the VM is being 
held off from running and see if you can address that.
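One possible starting point for that "lower the river" side: from inside the guest, the steal-time field in /proc/stat counts ticks the vCPU was runnable but the hypervisor didn't schedule it. Steal rising in step with the drop counter would point at host-side CPU contention. A sketch, with the field index taken from the standard /proc/stat layout (user nice system idle iowait irq softirq steal):

```python
# Hedged sketch: read the guest's aggregate steal-time counter, a measure
# of how long the hypervisor held the vCPUs off the physical CPUs.
def steal_ticks(stat_line):
    """Return the steal-time tick counter from a /proc/stat 'cpu' line."""
    fields = stat_line.split()
    # fields[0] is the 'cpu' label; steal is the 8th numeric field.
    return int(fields[8])

if __name__ == "__main__":
    with open("/proc/stat") as f:
        line = f.readline()          # aggregate 'cpu' line
    print("steal ticks so far:", steal_ticks(line))
```

Sampling this alongside tx_dropped would show whether drop bursts line up with periods where the guest simply wasn't being run.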

rick jones
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
