Open Source and information security mailing list archives
 
Date:   Wed, 10 May 2017 16:56:20 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     Anton Ivanov <anton.ivanov@...bridgegreys.com>,
        Stefan Hajnoczi <stefanha@...hat.com>
Cc:     "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
        "Michael S. Tsirkin" <mst@...hat.com>
Subject: Re: DQL and TCQ_F_CAN_BYPASS destroy performance under virtualization
 (Was: "Re: net_sched strange in 4.11")



On 2017-05-10 13:28, Anton Ivanov wrote:
> On 10/05/17 03:18, Jason Wang wrote:
>>
>> On 2017-05-09 23:11, Stefan Hajnoczi wrote:
>>> On Tue, May 09, 2017 at 08:46:46AM +0100, Anton Ivanov wrote:
>>>> I have figured it out. Two issues.
>>>>
>>>> 1) skb->xmit_more is hardly ever set under virtualization because the
>>>> qdisc is usually bypassed because of TCQ_F_CAN_BYPASS. Once
>>>> TCQ_F_CAN_BYPASS is set, a virtual NIC driver is not likely to see
>>>> skb->xmit_more (this answers my "how does this work at all" question).
>>>>
>>>> 2) If that flag is turned off (I patched sch_generic to turn it off in
>>>> pfifo_fast while testing), DQL keeps xmit_more from being set. If the
>>>> driver is not DQL enabled, xmit_more is never ever set. If the driver
>>>> is DQL enabled, the queue is adjusted to ensure xmit_more stops
>>>> happening within 10-15 xmit cycles.
>>>>
>>>> That is plain *wrong* for virtual NICs - virtio, emulated NICs, etc.
>>>> There, the BIG cost is telling the hypervisor that it needs to "kick"
>>>> the packets. The cost of putting them into the vNIC buffers is
>>>> negligible. You want xmit_more to happen - it makes between 50% and
>>>> 300% (depending on vNIC design) difference. If there is no xmit_more,
>>>> the vNIC will immediately "kick" the hypervisor and try to signal that
>>>> the packet needs to move straight away (as for example in virtio_net).
>> How do you measure the performance? TCP or just measure pps?
> In this particular case - tcp from guest. I have a couple of other
> benchmarks (forwarding, etc).
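
To make the cost argument quoted above concrete, here is a minimal user-space sketch in Python. It is not kernel code. The function name tx_cost and the cost units are assumptions chosen for illustration, not measurements. It only models the driver-side rule "kick = !skb->xmit_more" and ignores any suppression done by the host.

```python
def tx_cost(xmit_more_flags, enqueue_cost=1, kick_cost=20):
    """Return (kicks, total_cost) for a sequence of transmitted packets.

    xmit_more_flags[i] is the value of skb->xmit_more the driver sees for
    packet i. The cost units are arbitrary assumptions, not measurements.
    """
    kicks = 0
    total = 0
    for xmit_more in xmit_more_flags:
        total += enqueue_cost      # placing the packet in the vNIC ring
        if not xmit_more:          # models: bool kick = !skb->xmit_more;
            kicks += 1
            total += kick_cost     # notifying the hypervisor
    return kicks, total


# xmit_more never set: every packet triggers a kick.
print(tx_cost([False] * 8))                    # (8, 168)
# Batches of 4: xmit_more set on all but the last packet of each batch.
print(tx_cost([True, True, True, False] * 2))  # (2, 48)
```

With these assumed costs the batched case is roughly 3.5 times cheaper, which is the kind of gap being claimed. The real ratio depends entirely on the actual kick cost of the vNIC in question.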

One more question: are these numbers for virtio-net or for some other emulated vNIC?

>
>>>> In addition to that, the perceived line rate is proportional to this
>>>> cost, so I am not sure that the current dql math holds. In fact, I
>>>> think it does not - it is trying to adjust something which influences
>>>> the perceived line rate.
>>>>
>>>> So - how do we turn off BOTH bypass and DQL adjustment while under
>>>> virtualization and set them to be "always qdisc" + "always xmit_more
>>>> allowed"?
>> virtio-net does not support BQL. Before commit ea7735d97ba9
>> ("virtio-net: move free_old_xmit_skbs"), it was not even possible to
>> support it, since we did not have a tx interrupt for each packet. I
>> haven't measured the impact of xmit_more; maybe I am wrong, but I
>> think it may help in some cases since it may improve the batching on
>> the host to some degree.
> If you do not support BQL, you might as well look at the xmit_more part
> of the kick code path. Line 1127.
>
> bool kick = !skb->xmit_more; effectively means kick = true;
>
> It will never be triggered. You will be kicking for each and every
> packet.

Probably not. We have several ways to suppress this at the virtio layer.
The host can hint that kicks should be disabled, either:

- explicitly, by setting a flag
- implicitly, by not publishing a new event idx

FYI, I can get 100-200 packets per vm exit when testing 64 byte 
TCP_STREAM using netperf.
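
The two suppression mechanisms can be sketched in user-space Python as below. This is a model, not the driver code. VRING_USED_F_NO_NOTIFY and the vring_need_event() comparison follow the virtio ring definitions; should_kick and its parameter names are hypothetical, and the sketch assumes a single producer adding one buffer at a time.

```python
VRING_USED_F_NO_NOTIFY = 1  # "do not kick me" flag set by the host


def vring_need_event(event_idx, new_idx, old_idx):
    """Wrap-safe 16-bit test: did the avail index pass event_idx while
    moving from old_idx to new_idx?"""
    return ((new_idx - event_idx - 1) & 0xFFFF) < ((new_idx - old_idx) & 0xFFFF)


def should_kick(event_idx_enabled, used_flags, avail_event, old_idx, new_idx):
    """Hypothetical helper: decide whether the guest must notify the host."""
    if event_idx_enabled:
        # Implicit hint: the host only wants a kick once the avail index
        # passes the event index it last published.
        return vring_need_event(avail_event, new_idx, old_idx)
    # Explicit hint: the host sets a flag while it is busy polling.
    return not (used_flags & VRING_USED_F_NO_NOTIFY)


# The host publishes event idx 0 once and then stays busy without updating
# it, while the guest adds five buffers one at a time.
print([should_kick(True, 0, 0, i, i + 1) for i in range(5)])
# [True, False, False, False, False]
```

Only the first buffer causes a kick here, even though xmit_more was never consulted. That is consistent with seeing many packets per VM exit without BQL.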

> xmit_more is now set only via BQL. If BQL is not enabled you
> never get it. Now, will the current dql code work correctly if you do
> not have a defined line rate and completion interrupts - no idea.
> Probably not. IMHO instead of trying to fix it there should be a way for
> a device or architecture to turn it off.

In fact, BQL is not the only user of xmit_more. Pktgen with burst is
another. My tests do not show an obvious difference when I change burst
from 0 to 64, since we already have other ways to avoid kicking the host.
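
As a rough illustration of how a pktgen burst maps onto xmit_more, the Python sketch below models a sender that passes "more packets follow" for every packet of a burst except the last one. The function name is hypothetical, and treating burst values below 1 as 1 is an assumption of this model.

```python
def pktgen_xmit_more_flags(burst):
    """Return the xmit_more value seen by the driver for each packet of
    one burst (model only; names are hypothetical)."""
    burst = max(int(burst), 1)   # assumption: values below 1 behave like 1
    flags = []
    while burst > 0:
        burst -= 1
        flags.append(burst > 0)  # True while more packets remain in the burst
    return flags


print(pktgen_xmit_more_flags(1))  # [False]
print(pktgen_xmit_more_flags(4))  # [True, True, True, False]
```

A driver that kicks only when xmit_more is clear would kick once per burst in this model. If the host is already suppressing most kicks, the burst size changes little, which matches the result reported above.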

>
> To be clear - I ran into this working on my own drivers for UML, you are
> cc-ed because you are likely to be one of the most affected.

I'm still not quite sure what the issue is. It looks like virtio-net is OK,
since BQL is not supported and the impact of xmit_more can be ignored.

Thanks

>
> A.
>
>> Thanks
>>
>>>> A.
>>>>
>>>> P.S. Cc-ing virtio maintainer
>>> CCing Michael Tsirkin and Jason Wang, who are the core virtio and
>>> virtio-net maintainers.  (I maintain the vsock driver - it's unrelated
>>> to this discussion.)
>>>
>>>> A.
>>>>
>>>>
>>>> On 08/05/17 08:15, Anton Ivanov wrote:
>>>>> Hi all,
>>>>>
>>>>> I was revising some of my old work for UML to prepare it for
>>>>> submission and I noticed that skb->xmit_more does not seem to be set
>>>>> any more.
>>>>>
>>>>> I traced the issue as far as net/sched/sch_generic.c
>>>>>
>>>>> try_bulk_dequeue_skb() is never invoked (the drivers I am working on
>>>>> are dql enabled so that is not the problem).
>>>>>
>>>>> More interestingly, if I put a breakpoint and debug output into
>>>>> dequeue_skb() around line 147 - right before the bulk: tag - the skb
>>>>> there is always NULL. ???
>>>>>
>>>>> Similarly, debug in pfifo_fast_dequeue shows only NULLs being
>>>>> dequeued. Again - ???
>>>>>
>>>>> First and foremost, I apologize for the silly question, but how can
>>>>> this work at all? I see the skbs showing up at the driver level, why
>>>>> are NULLs being returned at qdisc dequeue and where do the skbs at the
>>>>> driver level come from?
>>>>>
>>>>> Second, where should I look to fix it?
>>>>>
>>>>> A.
>>>>>
>>>> -- 
>>>> Anton R. Ivanov
>>>>
>>>> Cambridge Greys Limited, England company No 10273661
>>>> http://www.cambridgegreys.com/
>>>>
>>
>
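
For the original question quoted above (dequeue returning only NULL while skbs still reach the driver), the toy Python model below shows how a bypass flag could produce that pattern. It is a single-threaded sketch under strong assumptions: the device is never busy, there is no locking, and the class and function names are hypothetical stand-ins rather than the kernel's data structures.

```python
from collections import deque


class Qdisc:
    """Toy stand-in for a qdisc such as pfifo_fast (hypothetical model)."""

    def __init__(self, can_bypass):
        self.can_bypass = can_bypass  # models TCQ_F_CAN_BYPASS
        self.queue = deque()
        self.running = False          # models the qdisc "running" state
        self.dequeued = []            # every value dequeue() returned
        self.driver_saw = []          # packets handed to the driver

    def dequeue(self):
        skb = self.queue.popleft() if self.queue else None
        self.dequeued.append(skb)
        return skb

    def run(self):
        # Drain the queue; the final dequeue returns None and ends the loop.
        while True:
            skb = self.dequeue()
            if skb is None:
                break
            self.driver_saw.append(skb)


def dev_xmit_skb(q, skb):
    if q.can_bypass and not q.queue and not q.running:
        q.running = True
        q.driver_saw.append(skb)      # direct transmit, never enqueued
        q.run()                       # queue is empty: dequeue yields None
        q.running = False
    else:
        q.queue.append(skb)
        if not q.running:
            q.running = True
            q.run()
            q.running = False


bypass = Qdisc(can_bypass=True)
for pkt in range(3):
    dev_xmit_skb(bypass, pkt)
print(bypass.dequeued)    # [None, None, None]
print(bypass.driver_saw)  # [0, 1, 2]
```

In this model the driver sees every packet, yet dequeue only ever returns None, because each packet takes the direct path while the queue is empty. With can_bypass=False each packet is enqueued and dequeued one at a time, so there is still nothing to batch unless the device pushes back.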
