Message-ID: <CA+mtBx9B5aS9Jv6tpUJpmBkvQFnUqjcxcgTjbBHee8K0f1C9KQ@mail.gmail.com>
Date: Wed, 21 Jan 2015 13:54:52 -0800
From: Tom Herbert <therbert@...gle.com>
To: Jesse Gross <jesse@...ira.com>
Cc: Pravin Shelar <pshelar@...ira.com>,
David Miller <davem@...emloft.net>,
Linux Netdev List <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 0/3] openvswitch: Add STT support.
On Wed, Jan 21, 2015 at 12:35 PM, Jesse Gross <jesse@...ira.com> wrote:
> On Wed, Jan 21, 2015 at 11:45 AM, Tom Herbert <therbert@...gle.com> wrote:
>>> I used bare-metal Intel servers. All VXLAN tests were done using the Linux
>>> kernel device without any VMs. All STT tests were done using an OVS bridge
>>> and an STT port.
>>>
>> So right off the bat you're running the baseline differently than the
>> target. Anyway, I cannot replicate your numbers for VXLAN; I see much
>> better performance, and that is with pretty old servers and dumb NICs. I
>> suspect you might not have GSO/GRO properly enabled, but instead of
>> trying to debug your setup, I'd rather restate my request that you
>> provide a network interface to STT so we can do our own fair
>> comparison.
>
> If I had to guess, I suspect the difference is that UDP RSS wasn't
> enabled, since it doesn't come that way out of the box. Regardless,
> you can clearly see a significant difference in single core
> performance and CPU consumption.
>
I'm not going to try to draw conclusions from data which is obviously
biased and incomplete. If you want to move forward on this, then just
provide a network interface for STT so we can independently run our own
comparisons against other encapsulations like we've been doing all
along.
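
(For anyone trying to reproduce the VXLAN baseline: UDP RSS generally
has to be turned on explicitly, and GRO/GSO is worth double-checking
too. On most NICs something along these lines works -- the device name
is just an example and driver support varies:

  ethtool -N eth0 rx-flow-hash udp4 sdfn   # hash UDP flows on addrs + ports
  ethtool -k eth0 | grep -E 'segmentation|receive-offload'
  ethtool -K eth0 gso on gro on

Treat that as a sketch, not a recipe.)
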
> STT has been fairly well known in network virtualization circles for
> the past few years and has some large deployments, so the reported
> performance is not a fluke. I remember Pankaj from Microsoft also
> mentioning to you that they weren't able to get performance to a
> reasonable level without TSO. Totally different environment, obviously,
> but same reasoning.
>
>>>>> VXLAN:
>>>>> CPU
>>>>> Client: 1.6
>>>>> Server: 14.2
>>>>> Throughput: 5.6 Gbit/s
>>>>>
>>>>> VXLAN with rcsum:
>>>>> CPU
>>>>> Client: 0.89
>>>>> Server: 12.4
>>>>> Throughput: 5.8 Gbit/s
>>>>>
>>>>> STT:
>>>>> CPU
>>>>> Client: 1.28
>>>>> Server: 4.0
>>>>> Throughput: 9.5 Gbit/s
>>>>>
>>>> 9.5 Gbps? Rounding error, or is this 40 Gbps or an MTU larger than 1500 bytes?
>>>>
>>> Nope, it's the same as the VXLAN setup: a 10 Gbps NIC with 1500 MTU.
>>>
>> That would exceed the theoretical maximum for TCP over 10 Gbps
>> Ethernet. How are you measuring throughput? How many bytes of protocol
>> headers are there in the STT case?
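
Just to put a rough number on that: with a 1500 MTU each frame costs
about 1538 bytes on the wire (1500 + 14 Ethernet header + 4 FCS + 20
preamble/IFG), and a TCP segment with timestamps carries 1448 bytes of
payload, so plain TCP goodput tops out around

  10 Gbit/s * 1448 / 1538 ~= 9.4 Gbit/s

That's back-of-the-envelope with standard header sizes, but it's why a
9.5 Gbit/s number for tunneled traffic at 1500 MTU needs some
explanation.
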
>
> For large packet cases, STT actually has less header overhead than the
> unencapsulated traffic stream. This is because, for a group of STT
> packets generated by a TSO burst from the guest, there is only a
> single copy of the inner header. Even though TCP headers are used for
> encapsulation, there are no options - as opposed to the inner headers,
> which typically contain timestamps. Over the course of the ~45 packets
> that could be generated from a maximum sized transmission, this
> results in negative encapsulation overhead.
>
> I would recommend you take a look at the draft if you haven't already:
> http://tools.ietf.org/html/draft-davie-stt-06
>
> It is currently in the final stages of the RFC publication process.
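
If I follow that accounting, the rough numbers would be (assuming 14
bytes Ethernet, 20 bytes IPv4, 20 bytes TCP without options, 32 bytes
with timestamps, and an 18-byte STT frame header, with a ~64KB TSO
frame segmenting into about 45 wire packets):

  unencapsulated:  45 * (14 + 20 + 32)            ~= 2970 header bytes
  STT:             45 * (14 + 20 + 20) + 18 + 66  ~= 2514 header bytes

so the single copy of the inner headers plus option-less outer TCP
would indeed come out slightly ahead per large send. Is that the
comparison behind the 9.5 Gbit/s figure?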