Message-ID: <1505884427.29839.84.camel@edumazet-glaptop3.roam.corp.google.com>
Date: Tue, 19 Sep 2017 22:13:47 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: "Samudrala, Sridhar" <sridhar.samudrala@...el.com>
Cc: Tom Herbert <tom@...bertland.com>,
Alexander Duyck <alexander.h.duyck@...el.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [RFC PATCH] net: Introduce a socket option to enable picking tx
queue based on rx queue.
On Tue, 2017-09-19 at 21:59 -0700, Samudrala, Sridhar wrote:
> On 9/19/2017 5:48 PM, Tom Herbert wrote:
> > On Tue, Sep 19, 2017 at 5:34 PM, Samudrala, Sridhar
> > <sridhar.samudrala@...el.com> wrote:
> > > On 9/12/2017 3:53 PM, Tom Herbert wrote:
> > > > On Tue, Sep 12, 2017 at 3:31 PM, Samudrala, Sridhar
> > > > <sridhar.samudrala@...el.com> wrote:
> > > > >
> > > > > On 9/12/2017 8:47 AM, Eric Dumazet wrote:
> > > > > > On Mon, 2017-09-11 at 23:27 -0700, Samudrala, Sridhar wrote:
> > > > > > > On 9/11/2017 8:53 PM, Eric Dumazet wrote:
> > > > > > > > On Mon, 2017-09-11 at 20:12 -0700, Tom Herbert wrote:
> > > > > > > >
> > > > > > > > > Two ints in sock_common for this purpose is quite expensive and the
> > > > > > > > > use case for this is limited-- even if a RX->TX queue mapping were
> > > > > > > > > introduced to eliminate the queue pair assumption this still won't
> > > > > > > > > help if the receive and transmit interfaces are different for the
> > > > > > > > > > connection. I think we really need to see some very compelling
> > > > > > > > > > results to be able to justify this.
> > > > > > > Will try to collect and post some perf data with symmetric queue
> > > > > > > configuration.
> > >
> > > Here is some performance data I collected with a memcached workload over
> > > an ixgbe 10Gb NIC with the mcblaster benchmark.
> > > ixgbe is configured with 16 queues and rx-usecs is set to 1000 for a very
> > > low interrupt rate.
> > > ethtool -L p1p1 combined 16
> > > ethtool -C p1p1 rx-usecs 1000
> > > and busy poll is set to 1000 usecs
> > > sysctl -w net.core.busy_poll=1000
> > >
> > > 16 threads, 800K requests/sec
> > > =======================================================================
> > >                    rtt (min/avg/max) usecs   intr/sec   contextswitch/sec
> > > -----------------------------------------------------------------------
> > > Default            2/182/10641               23391      61163
> > > Symmetric Queues   2/50/6311                 20457      32843
> > >
> > > 32 threads, 800K requests/sec
> > > =======================================================================
> > >                    rtt (min/avg/max) usecs   intr/sec   contextswitch/sec
> > > -----------------------------------------------------------------------
> > > Default            2/162/6390                32168      69450
> > > Symmetric Queues   2/50/3853                 35044      35847
> > >
> > No idea what "Default" configuration is. Please report how xps_cpus is
> > being set, how many RSS queues there are, and what the mapping is
> > between RSS queues and CPUs and shared caches. Also, whether any
> > threads are pinned.
> Default is Linux 4.13 with the settings I listed above.
> ethtool -L p1p1 combined 16
> ethtool -C p1p1 rx-usecs 1000
> sysctl -w net.core.busy_poll=1000
>
> # ethtool -x p1p1
> RX flow hash indirection table for p1p1 with 16 RX ring(s):
> 0: 0 1 2 3 4 5 6 7
> 8: 8 9 10 11 12 13 14 15
> 16: 0 1 2 3 4 5 6 7
> 24: 8 9 10 11 12 13 14 15
> 32: 0 1 2 3 4 5 6 7
> 40: 8 9 10 11 12 13 14 15
> 48: 0 1 2 3 4 5 6 7
> 56: 8 9 10 11 12 13 14 15
> 64: 0 1 2 3 4 5 6 7
> 72: 8 9 10 11 12 13 14 15
> 80: 0 1 2 3 4 5 6 7
> 88: 8 9 10 11 12 13 14 15
> 96: 0 1 2 3 4 5 6 7
> 104: 8 9 10 11 12 13 14 15
> 112: 0 1 2 3 4 5 6 7
> 120: 8 9 10 11 12 13 14 15
>
> smp_affinity for the 16 queue pairs (IRQ, name, mask)
> 141 p1p1-TxRx-0 0000,00000001
> 142 p1p1-TxRx-1 0000,00000002
> 143 p1p1-TxRx-2 0000,00000004
> 144 p1p1-TxRx-3 0000,00000008
> 145 p1p1-TxRx-4 0000,00000010
> 146 p1p1-TxRx-5 0000,00000020
> 147 p1p1-TxRx-6 0000,00000040
> 148 p1p1-TxRx-7 0000,00000080
> 149 p1p1-TxRx-8 0000,00000100
> 150 p1p1-TxRx-9 0000,00000200
> 151 p1p1-TxRx-10 0000,00000400
> 152 p1p1-TxRx-11 0000,00000800
> 153 p1p1-TxRx-12 0000,00001000
> 154 p1p1-TxRx-13 0000,00002000
> 155 p1p1-TxRx-14 0000,00004000
> 156 p1p1-TxRx-15 0000,00008000
> xps_cpus for the 16 Tx queues
> 0000,00000001
> 0000,00000002
> 0000,00000004
> 0000,00000008
> 0000,00000010
> 0000,00000020
> 0000,00000040
> 0000,00000080
> 0000,00000100
> 0000,00000200
> 0000,00000400
> 0000,00000800
> 0000,00001000
> 0000,00002000
> 0000,00004000
> 0000,00008000
> memcached threads are not pinned.
>
...
I urge you to take the time to properly tune this host.
The Linux kernel does not do automagic configuration. This is user policy.
Documentation/networking/scaling.txt has everything you need.
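
For reference, the layout quoted above (RSS indirection entries spread
round-robin over 16 queues, and one CPU per queue for both IRQ affinity and
xps_cpus) can be reproduced with a short Python sketch. It only prints the
commands and changes nothing on the host. The function names, and the
nr_cpus=48 default inferred from the 4-digit top group in the masks, are
assumptions for illustration; the device name and IRQ numbers are the ones
from the listing. The sysfs and procfs paths are those described in
Documentation/networking/scaling.txt. This mirrors the reported
configuration and is not a tuning recommendation.

```python
def cpu_mask(cpus, nr_cpus=48):
    """Format CPU ids as a comma-separated hex bitmap, 32 bits per group.

    nr_cpus=48 is an assumption: it yields the "0000,00000001" shape
    seen in the quoted listing (16-bit top group, 32-bit low group).
    """
    value = 0
    for cpu in cpus:
        if not 0 <= cpu < nr_cpus:
            raise ValueError(f"cpu {cpu} out of range for {nr_cpus} CPUs")
        value |= 1 << cpu

    groups = []
    bits_left = nr_cpus
    while bits_left > 0:
        bits = min(32, bits_left)       # low groups are 32 bits wide
        digits = (bits + 3) // 4        # hex digits needed for this group
        groups.append(f"{value & ((1 << bits) - 1):0{digits}x}")
        value >>= bits
        bits_left -= bits
    return ",".join(reversed(groups))   # most significant group first


def rss_queue(entry, n_queues=16):
    """RX queue for an indirection-table entry in the round-robin layout."""
    return entry % n_queues


def one_cpu_per_queue(dev="p1p1", n_queues=16, first_irq=141):
    """Map each IRQ and TX queue path to a mask: queue N on CPU N."""
    plan = {}
    for q in range(n_queues):
        mask = cpu_mask([q])
        plan[f"/proc/irq/{first_irq + q}/smp_affinity"] = mask
        plan[f"/sys/class/net/{dev}/queues/tx-{q}/xps_cpus"] = mask
    return plan


if __name__ == "__main__":
    # Print the equivalent shell commands without executing them.
    for path, mask in one_cpu_per_queue().items():
        print(f"echo {mask} > {path}")
```

Passing a different CPU list to cpu_mask(), for example all CPUs that share
a cache with the queue's IRQ CPU, gives the mask for a wider XPS mapping.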