Message-ID: <53ED1CB2.7050006@intel.com>
Date: Thu, 14 Aug 2014 13:31:46 -0700
From: Alexander Duyck <alexander.h.duyck@...el.com>
To: Rick Jones <rick.jones2@...com>,
Eric Dumazet <eric.dumazet@...il.com>
CC: David Miller <davem@...emloft.net>, netdev <netdev@...r.kernel.org>
Subject: Re: Performance regression on kernels 3.10 and newer
On 08/14/2014 12:59 PM, Rick Jones wrote:
> On 08/14/2014 11:46 AM, Eric Dumazet wrote:
>
>> I believe you answered your own question: prequeue mode does not work
>> very well when one host has hundreds of active TCP flows to another.
>>
>> In real life, applications do not use prequeue, because nobody wants one
>> thread per flow.
My concern here is that netperf is a standard tool for testing network
performance, and the kernel default is to run with tcp_low_latency
disabled. As such, the prequeue is part of the standard path, is it
not? If the prequeue isn't really useful anymore, should we consider
pulling it out of the kernel, or disabling it by making
tcp_low_latency the default?
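For reference, the prequeue path is gated by the tcp_low_latency sysctl mentioned above; a quick sketch of checking and flipping it (assumes a typical sysctl setup, and the write requires root):

```shell
# Show the current setting: 0 (the default) means the prequeue path is used
sysctl net.ipv4.tcp_low_latency

# Enable low-latency mode, which bypasses the prequeue (requires root)
sysctl -w net.ipv4.tcp_low_latency=1

# Persist the setting across reboots
echo 'net.ipv4.tcp_low_latency = 1' >> /etc/sysctl.conf
```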
>> Each socket has its own dst now that the route cache was removed, but
>> if your netperf migrates CPU (and NUMA node), we do not detect that
>> the dst should be re-created on a different NUMA node.
>
> Presumably, the -T $i,$j option in Alex's netperf command lines will
> have bound netperf and netserver to a specific CPU where they will have
> remained.
>
> rick jones
Yes, my test was affinitized per CPU. I was originally trying to test
some local vs. remote NUMA performance numbers. Also, as I mentioned, I
was using the ixgbe driver with an 82599, and I had ATR enabled, so the
receive flow was affinitized to the queue as well. We shouldn't have
had any cross-node chatter as a result of that.
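For anyone wanting to reproduce the affinitized setup described above, it would look roughly like this; the host address, CPU count, and test length are placeholders, not taken from the original runs:

```shell
# Run one netperf stream per CPU, pinning both ends with -T <local>,<remote>
# so each flow stays on a fixed CPU (and NUMA node) for the whole test.
HOST=192.0.2.10      # hypothetical netserver address
CPUS=7               # last CPU index to use; adjust to the machine
for i in $(seq 0 "$CPUS"); do
    netperf -H "$HOST" -T "$i,$i" -t TCP_STREAM -l 60 &
done
wait
```

With ATR (Flow Director) enabled on the 82599, the NIC steers each return flow to the queue associated with the transmitting CPU, so receive processing lands on the same CPU the socket is pinned to.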
Thanks,
Alex