Date:   Tue, 19 Sep 2017 22:13:47 -0700
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     "Samudrala, Sridhar" <sridhar.samudrala@...el.com>
Cc:     Tom Herbert <tom@...bertland.com>,
        Alexander Duyck <alexander.h.duyck@...el.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [RFC PATCH] net: Introduce a socket option to enable picking tx
 queue based on rx queue.

On Tue, 2017-09-19 at 21:59 -0700, Samudrala, Sridhar wrote:
> On 9/19/2017 5:48 PM, Tom Herbert wrote:
> > On Tue, Sep 19, 2017 at 5:34 PM, Samudrala, Sridhar
> > <sridhar.samudrala@...el.com> wrote:
> > > On 9/12/2017 3:53 PM, Tom Herbert wrote:
> > > > On Tue, Sep 12, 2017 at 3:31 PM, Samudrala, Sridhar
> > > > <sridhar.samudrala@...el.com> wrote:
> > > > > 
> > > > > On 9/12/2017 8:47 AM, Eric Dumazet wrote:
> > > > > > On Mon, 2017-09-11 at 23:27 -0700, Samudrala, Sridhar wrote:
> > > > > > > On 9/11/2017 8:53 PM, Eric Dumazet wrote:
> > > > > > > > On Mon, 2017-09-11 at 20:12 -0700, Tom Herbert wrote:
> > > > > > > > 
> > > > > > > > > Two ints in sock_common for this purpose is quite expensive and the
> > > > > > > > > use case for this is limited-- even if a RX->TX queue mapping were
> > > > > > > > > introduced to eliminate the queue pair assumption this still won't
> > > > > > > > > help if the receive and transmit interfaces are different for the
> > > > > > > > > connection. I think we really need to see some very compelling
> > > > > > > > > > results to be able to justify this.
> > > > > > > Will try to collect and post some perf data with symmetric queue
> > > > > > > configuration.
> > > 
> > > Here is some performance data I collected with a memcached workload over
> > > an ixgbe 10Gb NIC with the mcblaster benchmark.
> > > ixgbe is configured with 16 queues and rx-usecs is set to 1000 for a very
> > > low interrupt rate.
> > >       ethtool -L p1p1 combined 16
> > >       ethtool -C p1p1 rx-usecs 1000
> > > and busy poll is set to 1000 usecs
> > >       sysctl -w net.core.busy_poll=1000
> > > 
> > > 16 threads, 800K requests/sec
> > > ========================================================================
> > >                   rtt (min/avg/max) usecs   intr/sec   contextswitch/sec
> > > ------------------------------------------------------------------------
> > > Default           2/182/10641               23391      61163
> > > Symmetric Queues  2/50/6311                 20457      32843
> > > 
> > > 32 threads, 800K requests/sec
> > > ========================================================================
> > >                   rtt (min/avg/max) usecs   intr/sec   contextswitch/sec
> > > ------------------------------------------------------------------------
> > > Default           2/162/6390                32168      69450
> > > Symmetric Queues  2/50/3853                 35044      35847
> > > 
> > No idea what "Default" configuration is. Please report how xps_cpus is
> > being set, how many RSS queues there are, and what the mapping is
> > between RSS queues and CPUs and shared caches. Also, whether threads
> > are pinned.
> Default is Linux 4.13 with the settings I listed above.
>         ethtool -L p1p1 combined 16
>         ethtool -C p1p1 rx-usecs 1000
>         sysctl -w net.core.busy_poll=1000
> 
> # ethtool -x p1p1
> RX flow hash indirection table for p1p1 with 16 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:      0     1     2     3     4     5     6     7
>    24:      8     9    10    11    12    13    14    15
>    32:      0     1     2     3     4     5     6     7
>    40:      8     9    10    11    12    13    14    15
>    48:      0     1     2     3     4     5     6     7
>    56:      8     9    10    11    12    13    14    15
>    64:      0     1     2     3     4     5     6     7
>    72:      8     9    10    11    12    13    14    15
>    80:      0     1     2     3     4     5     6     7
>    88:      8     9    10    11    12    13    14    15
>    96:      0     1     2     3     4     5     6     7
>   104:      8     9    10    11    12    13    14    15
>   112:      0     1     2     3     4     5     6     7
>   120:      8     9    10    11    12    13    14    15
> 
> smp_affinity for the 16 queuepairs
>         141 p1p1-TxRx-0 0000,00000001
>         142 p1p1-TxRx-1 0000,00000002
>         143 p1p1-TxRx-2 0000,00000004
>         144 p1p1-TxRx-3 0000,00000008
>         145 p1p1-TxRx-4 0000,00000010
>         146 p1p1-TxRx-5 0000,00000020
>         147 p1p1-TxRx-6 0000,00000040
>         148 p1p1-TxRx-7 0000,00000080
>         149 p1p1-TxRx-8 0000,00000100
>         150 p1p1-TxRx-9 0000,00000200
>         151 p1p1-TxRx-10 0000,00000400
>         152 p1p1-TxRx-11 0000,00000800
>         153 p1p1-TxRx-12 0000,00001000
>         154 p1p1-TxRx-13 0000,00002000
>         155 p1p1-TxRx-14 0000,00004000
>         156 p1p1-TxRx-15 0000,00008000
> xps_cpus for the 16 Tx queues
>         0000,00000001
>         0000,00000002
>         0000,00000004
>         0000,00000008
>         0000,00000010
>         0000,00000020
>         0000,00000040
>         0000,00000080
>         0000,00000100
>         0000,00000200
>         0000,00000400
>         0000,00000800
>         0000,00001000
>         0000,00002000
>         0000,00004000
>         0000,00008000
> memcached threads are not pinned.
> 
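
[Editorial aside, not part of the original message.] The quoted listing can
be read mechanically. The sketch below decodes the hex CPU masks and models
the 128-entry RSS indirection table shown above. It assumes the usual
kernel mask layout (comma-separated 32-bit hex groups, most significant
first) and that the low seven bits of the flow hash index the table, as
Documentation/networking/scaling.txt describes for common RSS hardware.
The function names are made up for illustration.

```python
def parse_cpumask(mask):
    """Return the CPU numbers set in a kernel-style hex cpumask string."""
    value = 0
    # Groups are comma-separated hex words, most significant group first;
    # every group after the first carries 32 bits.
    for group in mask.strip().split(","):
        value = (value << 32) | int(group, 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


# Reproduces the 16 masks from the listing (same values for smp_affinity
# and xps_cpus): queue q has exactly bit q set.
masks = ["0000,%08x" % (1 << q) for q in range(16)]

# Reproduces the `ethtool -x` output: entry i points at RX ring i % 16.
indirection = [i % 16 for i in range(128)]


def rx_queue_for_hash(flow_hash):
    """Pick the RX ring for a flow hash (assumes low 7 bits index the table)."""
    return indirection[flow_hash & 0x7F]


if __name__ == "__main__":
    for q, m in enumerate(masks):
        print("queue %2d -> CPUs %s" % (q, parse_cpumask(m)))
```

With these masks each queue pair's interrupt and each Tx queue's XPS mask
resolve to a single CPU with the same index as the queue.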

...

I urge you to take the time to properly tune this host.

The Linux kernel does not do automagic configuration. This is user policy.

Documentation/networking/scaling.txt has everything you need.
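
[Editorial aside, not part of the original message.] As one illustration of
what such user policy can look like, the sketch below computes candidate
xps_cpus masks for the case where there are more CPUs than Tx queues. It
only builds the strings; writing them to
/sys/class/net/<dev>/queues/tx-<n>/xps_cpus is left to the administrator.
The modulo grouping is a placeholder assumption, not a recommendation:
scaling.txt suggests that CPUs sharing a queue should preferably share a
cache with the CPU handling that queue's transmit completions, which
depends on the host topology. The function names and the CPU counts are
hypothetical.

```python
def format_cpumask(cpus, nr_cpus):
    """Format a set of CPU numbers as comma-separated 32-bit hex groups."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    ngroups = (nr_cpus + 31) // 32
    groups = [(value >> (32 * i)) & 0xFFFFFFFF for i in range(ngroups)]
    # Most significant group is written first.
    return ",".join("%08x" % g for g in reversed(groups))


def xps_masks(nr_cpus, nr_queues):
    """Assign CPU c to Tx queue c % nr_queues (placeholder policy)."""
    per_queue = [[] for _ in range(nr_queues)]
    for cpu in range(nr_cpus):
        per_queue[cpu % nr_queues].append(cpu)
    return [format_cpumask(cpus, nr_cpus) for cpus in per_queue]


if __name__ == "__main__":
    # Hypothetical host: 32 CPUs, 16 Tx queues on p1p1.
    for q, mask in enumerate(xps_masks(32, 16)):
        print("/sys/class/net/p1p1/queues/tx-%d/xps_cpus <- %s" % (q, mask))
```

With as many CPUs as queues this reduces to the one-bit-per-queue masks
quoted earlier in the thread.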



