Message-ID: <4F6CC866.1090602@hp.com>
Date:	Fri, 23 Mar 2012 12:00:54 -0700
From:	Rick Jones <rick.jones2@...com>
To:	Thomas Lendacky <tahm@...ux.vnet.ibm.com>
CC:	Shirley Ma <mashirle@...ibm.com>,
	"Michael S. Tsirkin" <mst@...hat.com>, netdev@...r.kernel.org,
	kvm@...r.kernel.org
Subject: Re: [RFC PATCH 1/1] NUMA aware scheduling per cpu vhost thread

On 03/23/2012 11:32 AM, Thomas Lendacky wrote:
> I ran a series of TCP_RR, UDP_RR, TCP_STREAM and TCP_MAERTS tests
> against the recent vhost patches. For simplicity, the patches
> submitted by Anthony that increase the number of threads per vhost
> instance I will call multi-worker and the patches submitted by Shirley
> that provide a vhost thread per cpu I will call per-cpu.

Lots of nice data there - kudos.

> Quick description of the tests:
>    TCP_RR and UDP_RR using 256 byte request/response size in 1, 10, 30
>    and 60 instances

There is a point, I'm not quite sure where, at which aggregate, synchronous
single-transaction netperf tests become as much a context-switching test
as a networking test.  That is why netperf RR has support for "burst
mode", which keeps more than one transaction in flight at one time:

http://www.netperf.org/svn/netperf2/trunk/doc/netperf.html#Using-_002d_002denable_002dburst

When looking to measure packet/transaction-per-second scaling, I've taken
to finding the peak for a single stream by running up the burst size
(with TCP_NODELAY set) and then running 1, 2, 4, etc. of those streams,
with the occasional ethtool -S audit to make sure that each TCP_RR
transaction is indeed a discrete pair of TCP segments...
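
A rough sketch of that procedure in Python, for illustration only.  It
just builds the netperf command lines; it does not run them or pick the
peak.  The host address, burst values and stream counts are placeholders
(assumptions, not values from this thread).  The -b option needs a
netperf configured with --enable-burst, and -D is the test-specific
option that sets TCP_NODELAY; the 256-byte request/response size comes
from the test description above.

```python
import shlex


def tcp_rr_command(host, burst, seconds=30, req_resp=256):
    """Build one netperf TCP_RR command line using burst mode."""
    return [
        "netperf", "-t", "TCP_RR", "-H", host, "-l", str(seconds),
        "--",                            # test-specific options follow
        "-r", f"{req_resp},{req_resp}",  # request,response size in bytes
        "-b", str(burst),                # burst size (needs --enable-burst)
        "-D",                            # set TCP_NODELAY
    ]


def burst_sweep(host, bursts=(0, 1, 2, 4, 8, 16, 32)):
    """Step 1: single-stream runs with increasing burst size."""
    return [tcp_rr_command(host, b) for b in bursts]


def stream_plan(host, best_burst, counts=(1, 2, 4, 8)):
    """Step 2: for each stream count, the commands to launch concurrently."""
    return {n: [tcp_rr_command(host, best_burst) for _ in range(n)]
            for n in counts}


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address, used here as a placeholder.
    for cmd in burst_sweep("192.0.2.10"):
        print(shlex.join(cmd))
```

The burst value passed to stream_plan() would be whichever one gave the
highest single-stream transaction rate in the sweep.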

In addition to avoiding concerns about becoming a context-switching
exercise, the reduction in netperf instances means less chance for skew
error on startup and shutdown.  To reduce skew error further, I've
somewhat recently taken to using demo mode in netperf and then
post-processing the results through rrdtool:

http://www.netperf.org/svn/netperf2/trunk/doc/netperf.html#Using-_002d_002denable_002ddemo

I have a "one to many" script for that under:

http://www.netperf.org/svn/netperf2/trunk/doc/examples/runemomniaggdemo.sh

whose output is then post-processed via some stone knives and bearskins:
http://www.netperf.org/svn/netperf2/trunk/doc/examples/post_proc.sh
http://www.netperf.org/svn/netperf2/trunk/doc/examples/vrules.awk
http://www.netperf.org/svn/netperf2/trunk/doc/examples/mins_maxes.awk
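
The idea behind the demo-mode post-processing can be sketched in a few
lines of Python.  This is not what the scripts above do internally (they
go through rrdtool); it is only a minimal illustration of summing
per-interval results over the window in which every instance was
running, which is what takes startup and shutdown skew out of the
aggregate.  The "Interim result: ... ending at ..." line format in the
regular expression is an assumption and should be checked against the
output of the netperf version in use.

```python
import re
from collections import defaultdict

# Assumed demo-mode line format; verify against your netperf's output.
INTERIM = re.compile(
    r"Interim result:\s+([\d.]+)\s+\S+\s+over\s+([\d.]+)\s+seconds"
    r"\s+ending at\s+([\d.]+)"
)


def parse_interim(lines):
    """Return [(end_time, rate), ...] for one netperf instance."""
    samples = []
    for line in lines:
        m = INTERIM.search(line)
        if m:
            samples.append((float(m.group(3)), float(m.group(1))))
    return samples


def aggregate(instances):
    """Sum rates per one-second bucket, keeping only the overlap window.

    `instances` is a list of non-empty sample lists, one per netperf
    instance.  Samples before the last instance started reporting, or
    after the first one stopped, are dropped.
    """
    start = max(samples[0][0] for samples in instances)
    end = min(samples[-1][0] for samples in instances)
    buckets = defaultdict(float)
    for samples in instances:
        for end_time, rate in samples:
            if start <= end_time <= end:
                buckets[int(end_time)] += rate
    return dict(sorted(buckets.items()))
```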

I've also used that basic idea in some many-to-many tests involving 512
concurrent netperf instances, but that script isn't up on netperf.org.

>    TCP_STREAM and TCP_MAERTS using 256, 1K, 4K and 16K message sizes
>    and 1 and 4 instances

Netperf's own documentation and output are probably not good on this
point (feel free to loose petards, though some instances may be cast in
stone), but those aren't really message sizes.  They are simply the
quantity of data netperf presents to the transport in any one send
call.  They are send sizes.
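
To illustrate the distinction, here is a small self-contained Python
sketch.  It uses a local stream socket pair rather than a real TCP
connection (an assumption made so it runs anywhere without a peer), but
the byte-stream behaviour is the point: the sender hands data to the
transport in 256-byte send calls, and the receiver simply gets bytes,
in whatever chunk sizes recv() happens to return.

```python
import socket


def send_in_chunks(sock, total_bytes, send_size):
    """Hand `total_bytes` to the transport, `send_size` bytes per call."""
    payload = b"x" * send_size
    calls = 0
    sent = 0
    while sent < total_bytes:
        chunk = payload[: total_bytes - sent]
        sock.sendall(chunk)
        sent += len(chunk)
        calls += 1
    return calls


def drain(sock, total_bytes, recv_size=65536):
    """Read until `total_bytes` have arrived; return each recv() size."""
    sizes = []
    received = 0
    while received < total_bytes:
        data = sock.recv(recv_size)
        if not data:
            break
        sizes.append(len(data))
        received += len(data)
    return sizes


if __name__ == "__main__":
    sender, receiver = socket.socketpair()
    calls = send_in_chunks(sender, 4096, 256)
    sizes = drain(receiver, 4096)
    print(f"{calls} send calls of 256 bytes, recv() sizes: {sizes}")
    sender.close()
    receiver.close()
```

The recv() sizes need not match the send sizes; only the total is
guaranteed to agree.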

>    Remote host to VM using 1, 4, 12 and 24 VMs (2 vCPUs) with the tests
>    running between an external host and each VM.

I suppose it is implicit, and I'm just being pedantic/paranoid, but are
you confident of the limits of the external host?

>    Local VM to VM using 2, 4, 12 and 24 VMs (2 vCPUs) with the tests
>    running between VM pairs on the same host (no TCP_MAERTS done in
>    this situation).
>
> For TCP_RR and UDP_RR tests I report the transaction rate as the
> score and the transaction rate / KVMhost CPU% as the efficiency.
>
> For TCP_STREAM and TCP_MAERTS tests I report the throughput in Mbps
> as the score and the throughput / KVMhost CPU% as the efficiency.
>
> The KVM host machine is a nehalem-based 2-socket, 4-cores/socket
> system (E5530 @ 2.40GHz) with hyperthreading disabled and an Intel
> 10GbE single port network adapter.
>
> There's a lot of data and I hope this is the clearest way to report
> it.  The remote host to VM results are first followed by the local
> VM to VM results.

Looks reasonable as far as presentation goes.  Might have included a 
summary table of the various peaks:

TCP_RR Remote Host to VM:
            Inst -   Base    -  -Multi-Worker- -  Per-CPU  -
     VMs     /VM   Score   Eff    Score   Eff    Score   Eff
       1      60 117,448 3,929  148,330 3,616  137,996 3,898
       4      60 308,838 3,555  170,486 1,738  285,073 2,988
      12      60 156,868 1,574  152,205 1,527  223,701 2,250
      24      60 144,684 1,457  146,788 1,468  240,963 2,513

Given the KVM host machine is 8 cores with hyperthreading disabled, I
might have included a data point at 8 VMs even if they were 2 vCPU VMs,
but that is just my gut talking.  Certainly, looking at the summary table,
I'm wondering where between 4 and 12 VMs the curve starts its downward
trend.  Do 12 and 24 2-vCPU VMs force more moving around than, say, 16
or 32 would?
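
Since efficiency is reported as score / KVM host CPU%, the host CPU%
behind each summary row can be backed out as score / efficiency.  The
short Python sketch below does that arithmetic for the TCP_RR summary
table above.  The numbers are copied from that table; the results are
approximate because the reported efficiencies are presumably rounded,
and how the CPU% is scaled across the 8 cores is not stated in the
thread.

```python
# (VMs, base score, base eff, multi-worker score, multi-worker eff,
#  per-cpu score, per-cpu eff), TCP_RR remote host to VM, 60 instances/VM
SUMMARY = [
    (1, 117448, 3929, 148330, 3616, 137996, 3898),
    (4, 308838, 3555, 170486, 1738, 285073, 2988),
    (12, 156868, 1574, 152205, 1527, 223701, 2250),
    (24, 144684, 1457, 146788, 1468, 240963, 2513),
]


def implied_cpu(score, eff):
    """Invert eff = score / CPU% to recover the host CPU%."""
    return score / eff


def cpu_table(rows):
    """Map VM count -> (base, multi-worker, per-cpu) implied CPU%."""
    table = {}
    for vms, *pairs in rows:
        table[vms] = tuple(
            round(implied_cpu(pairs[i], pairs[i + 1]), 1) for i in (0, 2, 4)
        )
    return table


if __name__ == "__main__":
    print("VMs   base  multi  percpu")
    for vms, (base, multi, percpu) in cpu_table(SUMMARY).items():
        print(f"{vms:3d} {base:6.1f} {multi:6.1f} {percpu:6.1f}")
```

For the base configuration, the 12 and 24 VM rows work out to roughly
99% on whatever scale the CPU% was reported, against roughly 30% for
the 1 VM row.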

happy benchmarking,

rick jones

>
>
> Remote Host to VM:
>   Host to 1 VM
>                  -   Base    -  -Multi-Worker- -  Per-CPU  -
>    Test     Inst   Score   Eff    Score   Eff    Score   Eff
>    TCP_RR      1   9,587   984    9,725 1,145    9,252 1,041
>               10  63,919 3,095   51,841 2,415   55,226 2,884
>               30  85,646 3,288  127,277 3,242  145,644 4,092
>               60 117,448 3,929  148,330 3,616  137,996 3,898
>
>    UDP_RR      1  10,815 1,174   10,125 1,255    7,913 1,150
>               10  53,989 3,082   59,590 2,875   52,353 3,328
>               30  91,484 4,115   95,312 3,042  110,715 3,659
>               60 107,466 4,689  173,443 4,351  158,141 4,235
>
>    TCP_STREAM
>           256  1   2,724   140    2,450   131    2,681   150
>                4   5,027   137    4,147   146    3,998   117
>
>          1024  1   5,602   235    4,623   169    5,425   238
>                4   5,987   212    5,991   133    6,827   175
>
>          4096  1   6,202   256    6,753   211    7,247   279
>                4   4,996   192    5,771   159    7,124   202
>
>         16384  1   6,258   259    7,211   214    8,453   308
>                4   4,591   179    5,788   181    6,925   217
>
>    TCP_MAERTS
>           256  1   1,951    85    1,871    89    1,899    97
>                4   4,757   129    4,102   140    4,279   116
>
>          1024  1   7,479   381    6,970   371    7,374   427
>                4   8,931   385    6,612   258    8,731   417
>
>          4096  1   9,276   464    9,296   456    9,131   510
>                4   9,381   452    9,032   367    9,338   446
>
>         16384  1   9,153   496    8,817   589    9,238   516
>                4   9,358   478    9,006   367    9,350   462
>
>   Host to 1 VM (VM pinned to a socket)
>                  -   Base    -  -Multi-Worker- -  Per-CPU  -
>    Test     Inst   Score   Eff    Score   Eff    Score   Eff
>    TCP_RR      1   9,992 1,019    9,899   917    8,963   899
>               10  60,731 3,236   60,015 2,444   55,860 3,059
>               30 127,375 4,042  146,571 3,922  163,806 4,389
>               60 173,021 4,972  149,549 4,662  161,397 4,330
>
>    UDP_RR      1  10,854 1,253    7,983 1,120    7,647 1,206
>               10  68,128 3,804   64,335 4,067   53,343 3,233
>               30  92,456 3,994  112,101 4,219  111,610 3,598
>               60 135,741 4,590  184,441 4,422  184,527 4,546
>
>    TCP_STREAM
>           256  1   2,564   146    2,530   147    2,497   150
>                4   4,757   139    4,300   127    4,245   124
>
>          1024  1   4,700   209    6,062   323    5,627   247
>                4   6,828   214    7,125   153    6,561   172
>
>          4096  1   6,676   281    7,672   286    7,760   290
>                4   6,258   236    6,410   171    7,354   225
>
>         16384  1   6,712   289    8,217   297    8,457   322
>                4   5,764   235    6,285   200    7,554   245
>
>    TCP_MAERTS
>           256  1   1,673    82    1,444    71    1,756    88
>                4   6,385   175    5,671   155    5,685   153
>
>          1024  1   7,500   427    6,884   414    7,640   429
>                4   9,310   444    8,659   496    8,200   350
>
>          4096  1   8,427   477    9,201   515    8,825   422
>                4   9,372   478    9,184   394    9,391   446
>
>         16384  1   8,840   500    9,205   555    9,239   482
>                4   9,379   495    9,079   385    9,389   472
>
>   Host to 4 VMs
>                  -   Base    -  -Multi-Worker- -  Per-CPU  -
>    Test     Inst   Score   Eff    Score   Eff    Score   Eff
>    TCP_RR      1  38,635   949   34,063   843   35,432   897
>               10 193,703 2,604  157,699 1,841  180,323 2,858
>               30 279,736 3,301  170,343 1,739  269,827 2,875
>               60 308,838 3,555  170,486 1,738  285,073 2,988
>
>    UDP_RR      1  42,209 1,136   36,035   904   36,974   975
>               10 177,286 2,616  166,999 2,043  178,470 2,466
>               30 296,415 3,731  221,738 2,488  260,630 2,966
>               60 353,784 4,179  209,489 2,152  306,792 3,440
>
>    TCP_STREAM
>           256  1   8,409   113    7,517   101    7,178   115
>                4   8,963    93    7,825    80    8,606    91
>
>          1024  1   9,382   119   10,223   192    9,314   128
>                4   9,233   101    9,085   110    8,585   105
>
>          4096  1   9,391   124    9,393   125    9,300   140
>                4   9,303   103    9,151   102    8,601   106
>
>         16384  1   9,395   121    8,715   128    9,378   135
>                4   9,322   105    9,135   101    8,691   121
>
>    TCP_MAERTS
>           256  1   8,629   125    7,045   112    7,559   109
>                4   9,389   145    7,091    80    9,335   156
>
>          1024  1   9,385   201    9,349   148    9,320   248
>                4   9,392   154    9,340   148    9,390   226
>
>          4096  1   9,387   239    9,339   151    9,379   291
>                4   9,392   167    9,389   124    9,390   259
>
>         16384  1   9,374   236    9,366   150    9,391   317
>                4   9,365   167    9,394   123    9,390   284
>
>   Host to 12 VMs
>                  -   Base    -  -Multi-Worker- -  Per-CPU  -
>    Test     Inst   Score   Eff    Score   Eff    Score   Eff
>    TCP_RR      1  79,628   928   85,717   944   72,760   885
>               10 106,348 1,067   94,032   944  164,548 2,017
>               30 131,313 1,318  116,431 1,168  206,560 2,367
>               60 156,868 1,574  152,205 1,527  223,701 2,250
>
>    UDP_RR      1  90,762 1,059   93,904 1,037   75,512   919
>               10 149,381 1,499  113,254 1,136  194,153 1,951
>               30 177,803 1,783  132,818 1,333  235,682 2,370
>               60 201,833 2,025  154,871 1,554  258,133 2,595
>
>    TCP_STREAM
>           256  1   8,549    86    7,173    72    8,407    85
>                4   8,910    89    8,693    87    8,768    88
>
>          1024  1   9,397    95    9,371    94    9,376    95
>                4   9,289    93    9,268   100    8,898    92
>
>          4096  1   9,399    95    9,415    95    9,401    97
>                4   9,336    94    9,319    94    8,938    94
>
>         16384  1   9,405    95    9,402    96    9,397   102
>                4   9,366    94    9,345    94    8,890    94
>
>    TCP_MAERTS
>           256  1   4,646    49    2,273    23    9,232   135
>                4   9,393   107    8,019    81    9,414   134
>
>          1024  1   9,393   115    9,403   104    9,399   178
>                4   9,406   110    9,383    98    9,392   157
>
>          4096  1   9,393   114    9,409   104    9,388   202
>                4   9,388   110    9,387    98    9,382   181
>
>         16384  1   9,396   114    9,391   104    9,394   221
>                4   9,411   110    9,384    98    9,391   192
>
>   Host to 24 VMs
>                  -   Base    -  -Multi-Worker- -  Per-CPU  -
>    Test     Inst   Score   Eff    Score   Eff    Score   Eff
>    TCP_RR      1 110,139 1,118  101,765 1,033   79,189   805
>               10  94,757   948   90,872   915  156,821 1,581
>               30 119,904 1,199  120,728 1,207  214,151 2,211
>               60 144,684 1,457  146,788 1,468  240,963 2,513
>
>    UDP_RR      1 129,655 1,316  120,071 1,201   91,208   914
>               10 119,204 1,201  104,645 1,046  208,432 2,340
>               30 158,887 1,601  136,629 1,366  249,329 2,517
>               60 179,365 1,794  159,883 1,610  259,018 2,651
>
>    TCP_STREAM
>           256  1   5,899    59    4,258    44    8,071    82
>                4   8,739    89    8,195    83    7,934    82
>
>          1024  1   8,477    86    7,498    76    9,268    93
>                4   9,205    93    9,171    94    8,159    84
>
>          4096  1   9,334    96    8,992    92    9,324    97
>                4   9,255    95    9,221    92    8,237    85
>
>         16384  1   9,373    96    9,356    95    9,311    96
>                4   9,283    94    9,275    93    8,317    86
>
>    TCP_MAERTS
>           256  1     739     7      770     8    9,186   129
>                4   7,804    79    7,573    76    9,253   122
>
>          1024  1   1,763    18    1,759    18    9,287   146
>                4   9,204    99    9,166    93    9,389   155
>
>          4096  1   3,430    35    3,403    35    9,348   161
>                4   9,372   100    9,315    95    9,385   151
>
>         16384  1   9,309   102    9,306    97    9,353   175
>                4   9,378   100    9,392    96    9,377   159
>
>
>
> Local VM to VM:
>
>   1 VM to 1 VM
>                  -   Base    -  -Multi-Worker- -  Per-CPU  -
>    Test     Inst   Score   Eff    Score   Eff    Score   Eff
>    TCP_RR      1   7,422   506    7,698   462    6,281   450
>               10  49,662 1,362   47,553 1,205   43,258 1,270
>               30  91,657 1,538   99,319 1,471   89,478 1,499
>               60 106,168 1,658  106,430 1,503   99,205 1,576
>
>    UDP_RR      1   8,414   552    8,532   528    6,976   499
>               10  58,359 1,645   55,283 1,398   48,094 1,457
>               30  91,046 1,736  109,403 1,721   92,109 1,715
>               60 128,835 2,021  130,382 1,807  118,563 1,853
>
>    TCP_STREAM
>           256  1   2,029    60    1,923    54    1,998    64
>                4   3,861    66    3,445    53    2,914    54
>
>          1024  1   7,374   205    6,465   174    5,704   165
>                4   8,474   196    7,541   161    6,274   156
>
>          4096  1  12,825   295   11,921   275   10,262   262
>                4  12,639   253   13,395   260   11,451   264
>
>         16384  1  14,576   331   14,141   291   11,925   305
>                4  16,016   327   14,210   274   13,656   308
>
>
>   1 VM to 1 VM (each VM pinned to a socket)
>                  -   Base    -  -Multi-Worker- -  Per-CPU  -
>    Test     Inst   Score   Eff    Score   Eff    Score   Eff
>    TCP_RR      1   7,145   489    7,840   477    5,965   467
>               10  51,016 1,406   47,881 1,223   45,232 1,288
>               30  92,785 1,580  103,453 1,512   91,437 1,523
>               60 120,160 1,817  115,058 1,595  102,734 1,611
>
>    UDP_RR      1   7,908   547    8,704   541    6,552   528
>               10  59,807 1,653   56,598 1,435   50,524 1,488
>               30  90,302 1,738  113,861 1,765   94,640 1,720
>               60 141,684 2,196  141,866 1,919  125,334 1,917
>
>    TCP_STREAM
>           256  1   2,210    64    1,291    32    2,069    64
>                4   3,993    64    3,441    52    2,780    50
>
>          1024  1   8,106   217    7,571   198    5,709   165
>                4   8,471   206    8,756   174    6,531   157
>
>          4096  1  15,360   350   13,825   303   10,717   271
>                4  14,671   330   12,604   263   11,266   258
>
>         16384  1  18,284   395   16,305   337   13,185   317
>                4  15,451   331   12,438   247   14,699   316
>
>
>   2 VMs to 2 VMs (4 VMs total)
>                  -   Base    -  -Multi-Worker- -  Per-CPU  -
>    Test     Inst   Score   Eff    Score   Eff    Score   Eff
>    TCP_RR      1  15,498   491   16,518   460   13,008   441
>               10  71,425   983   79,711 1,063   85,087 1,037
>               30 102,132 1,436   82,191 1,145  100,504 1,076
>               60 127,670 1,608   96,815 1,262  104,694 1,119
>
>    UDP_RR      1  17,091   548   18,214   538   14,780   492
>               10  77,682 1,129   87,523 1,235   86,755 1,165
>               30 131,830 1,826   92,844 1,327  111,839 1,232
>               60 145,688 1,952  111,315 1,520  116,358 1,296
>
>    TCP_STREAM
>           256  1   5,085    72    3,900    50    2,430    38
>                4   6,622    70    4,337    48    5,032    58
>
>          1024  1  15,262   206   15,022   195    7,000   115
>                4  14,205   174   15,288   174   11,030   148
>
>          4096  1  15,020   197   21,694   261   13,583   198
>                4  16,818   205   16,076   195   17,175   238
>
>         16384  1  19,671   261   23,699   290   22,396   306
>                4  18,648   229   17,901   218   17,122   251
>
>   6 VMs to 6 VMs (12 VMs total)
>                  -   Base    -  -Multi-Worker- -  Per-CPU  -
>    Test     Inst   Score   Eff    Score   Eff    Score   Eff
>    TCP_RR      1  30,242   400   32,281   390   27,737   401
>               10  73,461   783   61,856   644   93,259 1,000
>               30  98,638 1,034   81,799   844  107,022 1,121
>               60 114,238 1,200   91,772   944  110,839 1,152
>
>    UDP_RR      1  33,017   438   35,540   429   30,022   438
>               10  84,676   910   67,838   711  112,339 1,220
>               30 110,799 1,156   90,555   932  128,928 1,357
>               60 129,679 1,354  100,715 1,033  136,503 1,429
>
>    TCP_STREAM
>           256  1   6,947    72    5,380    56    6,138    72
>                4   8,400    85    7,660    77    8,893    89
>
>          1024  1  13,698   146   10,307   108   13,023   158
>                4  15,391   157   13,242   135   17,264   182
>
>          4096  1  18,928   202   14,580   154   16,970   189
>                4  18,826   191   17,262   175   19,558   212
>
>         16384  1  22,176   234   17,716   187   21,245   243
>                4  21,306   215   20,332   206   18,353   227
>
>   12 VMs to 12 VMs (24 VMs total)
>                  -   Base    -  -Multi-Worker- -  Per-CPU  -
>    Test     Inst   Score   Eff    Score   Eff    Score   Eff
>    TCP_RR      1  72,926   731   67,338   675   32,662   387
>               10  62,441   625   59,277   594   87,286   891
>               30  72,761   728   67,760   679  102,549 1,041
>               60  78,087   782   74,654   748  100,687 1,016
>
>    UDP_RR      1  82,662   829   80,875   810   34,915   421
>               10  71,424   716   67,754   679  111,753 1,147
>               30  79,495   796   75,512   756  134,576 1,372
>               60  83,339   835   77,523   778  137,058 1,390
>
>    TCP_STREAM
>           256  1   2,870    29    2,631    26    7,907    80
>                4   8,424    84    8,026    80    8,929    90
>
>          1024  1   3,674    37    3,121    31   15,644   164
>                4  14,256   143   13,342   134   16,116   168
>
>          4096  1   5,068    51    4,366    44   16,179   168
>                4  17,015   171   16,321   164   17,940   186
>
>         16384  1   9,768    98    9,025    90   19,233   203
>                4  18,981   190   18,202   183   18,964   203
>
>
> On Thursday, March 22, 2012 05:16:30 PM Shirley Ma wrote:
>> Resubmit it with the right format.
>>
>> Signed-off-by: Shirley Ma<xma@...ibm.com>
>> Signed-off-by: Krishna Kumar<krkumar2@...ibm.com>
>> Tested-by: Tom Lendacky<toml@...ibm.com>
>> ---
>>
>>   drivers/vhost/net.c                  |   26 ++-
>>   drivers/vhost/vhost.c                |  300 ++++++++++++++++++++++++----------
>>   drivers/vhost/vhost.h                |   16 ++-
>>   3 files changed, 243 insertions(+), 103 deletions(-)
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

