Message-ID: <4A4DE3C1.5080307@vlnb.net>
Date:	Fri, 03 Jul 2009 14:56:01 +0400
From:	Vladislav Bolkhovitin <vst@...b.net>
To:	Ronald Moesbergen <intercommit@...il.com>
CC:	Wu Fengguang <fengguang.wu@...el.com>, linux-kernel@...r.kernel.org
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev


Ronald Moesbergen, on 07/03/2009 01:14 PM wrote:
>>> OK, now I tend to agree on decreasing max_sectors_kb and increasing
>>> read_ahead_kb. But before actually trying to push that idea I'd like
>>> to
>>> - do more benchmarks
>>> - figure out why context readahead didn't help SCST performance
>>>  (previous traces show that context readahead is submitting perfect
>>>   large io requests, so I wonder if it's some io scheduler bug)
>> Because, as we found out, without your http://lkml.org/lkml/2009/5/21/319
>> patch read-ahead was nearly disabled, hence there was no difference in
>> which algorithm was used?
>>
>> Ronald, can you run the following tests, please? This time with 2 hosts,
>> initiator (client) and target (server) connected using 1 Gbps iSCSI. It
>> would be best if vanilla 2.6.29 were run on the client, but any other
>> kernel is fine as well; just specify which one. Blockdev-perftest should
>> be run as before in buffered mode, i.e. with the "-a" switch. (A sketch
>> of setting the RA and max_sectors_kb knobs used below follows the list.)
>>
>> 1. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
>> http://lkml.org/lkml/2009/5/21/319 patch with all default settings.
>>
>> 2. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
>> http://lkml.org/lkml/2009/5/21/319 patch with default RA size and 64KB
>> max_sectors_kb.
>>
>> 3. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
>> http://lkml.org/lkml/2009/5/21/319 patch with 2MB RA size and default
>> max_sectors_kb.
>>
>> 4. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
>> http://lkml.org/lkml/2009/5/21/319 patch with 2MB RA size and 64KB
>> max_sectors_kb.
>>
>> 5. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
>> http://lkml.org/lkml/2009/5/21/319 patch and with context RA patch. RA size
>> and max_sectors_kb are default. For your convenience I committed the
>> backported context RA patches into the SCST SVN repository.
>>
>> 6. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
>> http://lkml.org/lkml/2009/5/21/319 and context RA patches with default RA
>> size and 64KB max_sectors_kb.
>>
>> 7. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
>> http://lkml.org/lkml/2009/5/21/319 and context RA patches with 2MB RA size
>> and default max_sectors_kb.
>>
>> 8. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
>> http://lkml.org/lkml/2009/5/21/319 and context RA patches with 2MB RA size
>> and 64KB max_sectors_kb.
>>
>> 9. On the client default RA size and 64KB max_sectors_kb. On the server
>> vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
>> context RA patches with 2MB RA size and 64KB max_sectors_kb.
>>
>> 10. On the client 2MB RA size and default max_sectors_kb. On the server
>> vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
>> context RA patches with 2MB RA size and 64KB max_sectors_kb.
>>
>> 11. On the client 2MB RA size and 64KB max_sectors_kb. On the server vanilla
>> 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA
>> patches with 2MB RA size and 64KB max_sectors_kb.
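
(For reference, the RA size and max_sectors_kb settings above are per-device
knobs. A minimal sketch of setting them, assuming the device is /dev/sdb;
note that "blockdev --setra" and read_ahead_kb control the same knob, in
512-byte sectors and in KB respectively:

    # /dev/sdb is a placeholder device; adjust to your setup
    # 2MB RA size: 4096 512-byte sectors, or equivalently 2048 KB
    blockdev --setra 4096 /dev/sdb
    echo 2048 > /sys/block/sdb/queue/read_ahead_kb

    # 64KB max request size
    echo 64 > /sys/block/sdb/queue/max_sectors_kb

    # verify
    blockdev --getra /dev/sdb
    cat /sys/block/sdb/queue/max_sectors_kb
)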
> 
> Ok, done. Performance is pretty bad overall :(
> 
> The kernels I used:
> client kernel: 2.6.26-15lenny3 (Debian)
> server kernel: 2.6.29.5 with the blk_run_backing_dev patch
> 
> I also adjusted the blockdev-perftest script to drop caches on both the
> server (via ssh) and the client.
> 
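(For reference, cache dropping on both sides is usually done through
/proc/sys/vm/drop_caches; roughly, before each run, something like:

    # "server" is a placeholder for the target host
    sync; echo 3 > /proc/sys/vm/drop_caches
    ssh root@server 'sync; echo 3 > /proc/sys/vm/drop_caches'
)
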
> The results:
> 
> 1) client: default, server: default
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  19.808   20.078   20.180   51.147    0.402    0.799
>  33554432  19.162   19.952   20.375   51.673    1.322    1.615
>  16777216  19.714   20.331   19.948   51.214    0.649    3.201
>   8388608  18.572   20.126   20.345   52.116    2.149    6.515
>   4194304  18.711   19.663   19.811   52.831    1.350   13.208
>   2097152  19.112   19.927   19.130   52.832    1.022   26.416
>   1048576  19.771   19.686   20.010   51.661    0.356   51.661
>    524288  19.585   19.940   19.483   52.065    0.515  104.131
>    262144  19.168   20.794   19.605   51.634    1.757  206.535
>    131072  19.077   20.776   20.271   51.160    1.849  409.282
>     65536  19.643   21.230   19.144   51.284    2.227  820.549
>     32768  19.702   20.869   19.686   51.020    1.380 1632.635
>     16384  21.218   20.222   20.221   49.846    1.121 3190.174
> 
> 2) client: default, server: 64 max_sectors_kb
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  20.881   20.102   21.689   49.065    1.522    0.767
>  33554432  20.329   19.938   20.522   50.543    0.609    1.579
>  16777216  20.247   19.744   20.912   50.468    1.185    3.154
>   8388608  19.739   20.184   21.032   50.433    1.318    6.304
>   4194304  19.968   18.748   20.230   52.174    1.750   13.043
>   2097152  19.633   20.068   19.858   51.584    0.462   25.792
>   1048576  20.552   20.618   20.974   49.437    0.440   49.437
>    524288  21.595   20.830   20.454   48.881    1.098   97.762
>    262144  21.720   20.602   20.176   49.201    1.515  196.805
>    131072  20.976   19.089   20.712   50.634    2.144  405.072
>     65536  20.661   19.952   19.312   51.303    1.414  820.854
>     32768  21.155   18.464   20.640   51.159    3.081 1637.090
>     16384  22.023   19.944   20.629   49.159    2.008 3146.205
> 
> 3) client: default, server: default max_sectors_kb, RA 2MB
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  21.709   19.315   18.319   52.028    3.631    0.813
>  33554432  20.745   19.209   19.048   52.142    1.976    1.629
>  16777216  19.762   19.175   19.485   52.591    0.649    3.287
>   8388608  19.812   19.142   19.574   52.498    0.749    6.562
>   4194304  19.931   19.786   19.505   51.877    0.466   12.969
>   2097152  19.473   19.208   19.438   52.859    0.322   26.430
>   1048576  19.524   19.033   19.477   52.941    0.610   52.941
>    524288  20.115   20.402   19.542   51.166    0.920  102.333
>    262144  19.291   19.715   21.016   51.249    1.844  204.996
>    131072  18.782   19.130   20.334   52.802    1.775  422.419
>     65536  19.030   19.233   20.328   52.475    1.504  839.599
>     32768  19.147   19.326   19.411   53.074    0.303 1698.357
>     16384  19.573   19.596   20.417   51.575    1.005 3300.788
> 
> 4) client: default, server: 64 max_sectors_kb, RA 2MB
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  22.604   21.707   20.721   47.298    1.683    0.739
>  33554432  21.654   20.812   21.162   48.293    0.784    1.509
>  16777216  20.461   19.782   21.160   50.068    1.377    3.129
>   8388608  20.886   20.434   21.512   48.914    1.028    6.114
>   4194304  22.154   20.512   21.433   47.974    1.517   11.993
>   2097152  22.258   20.971   20.738   48.071    1.478   24.035
>   1048576  19.953   21.294   19.662   50.497    1.731   50.497
>    524288  21.577   20.884   20.883   48.509    0.743   97.019
>    262144  20.959   20.749   20.256   49.587    0.712  198.347
>    131072  19.926   21.542   19.634   50.360    2.022  402.877
>     65536  20.973   22.546   20.840   47.793    1.685  764.690
>     32768  20.695   21.031   21.182   48.837    0.476 1562.791
>     16384  20.163   21.112   20.037   50.133    1.159 3208.481
> 
> 5) Server RA-context Patched, client: default, server: default
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  19.756   23.647   18.852   49.818    4.717    0.778
>  33554432  18.892   19.727   18.857   53.472    1.106    1.671
>  16777216  18.943   19.255   18.949   53.760    0.409    3.360
>   8388608  18.766   19.105   18.847   54.165    0.413    6.771
>   4194304  19.177   19.609   20.191   52.111    1.097   13.028
>   2097152  18.968   19.517   18.862   53.581    0.797   26.790
>   1048576  18.833   19.912   18.626   53.592    1.551   53.592
>    524288  19.128   19.379   19.134   53.298    0.324  106.596
>    262144  18.955   19.328   18.879   53.748    0.550  214.992
>    131072  18.401   19.642   18.928   53.961    1.439  431.691
>     65536  19.366   19.822   18.615   53.182    1.384  850.908
>     32768  19.252   19.229   18.752   53.683    0.653 1717.857
>     16384  21.373   19.507   19.162   51.282    2.415 3282.039
> 
> 6) Server RA-context Patched, client: default, server: 64
> max_sectors_kb, RA default
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  22.753   21.071   20.532   47.825    2.061    0.747
>  33554432  20.404   19.239   20.722   50.943    1.644    1.592
>  16777216  20.914   20.114   21.854   48.910    1.655    3.057
>   8388608  19.524   21.932   21.465   48.949    2.510    6.119
>   4194304  20.306   20.809   20.000   50.279    0.820   12.570
>   2097152  20.133   20.194   20.181   50.770    0.066   25.385
>   1048576  19.515   21.593   20.052   50.321    2.128   50.321
>    524288  20.231   20.502   20.299   50.335    0.284  100.670
>    262144  19.620   19.737   19.911   51.834    0.313  207.336
>    131072  20.486   21.138   22.339   48.089    1.711  384.714
>     65536  20.113   18.322   22.247   50.943    4.025  815.088
>     32768  23.341   23.328   20.809   45.659    2.511 1461.089
>     16384  20.962   21.839   23.405   46.496    2.100 2975.773
> 
> 7) Server RA-context Patched, client: default, server: default
> max_sectors_kb, RA 2MB
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  19.565   19.028   19.164   53.196    0.627    0.831
>  33554432  19.048   18.401   18.940   54.491    0.828    1.703
>  16777216  18.728   19.330   19.076   53.778    0.699    3.361
>   8388608  19.174   18.710   19.922   53.179    1.368    6.647
>   4194304  19.133   18.514   19.672   53.628    1.331   13.407
>   2097152  18.903   18.547   20.070   53.468    1.782   26.734
>   1048576  19.210   19.204   18.994   53.513    0.282   53.513
>    524288  18.978   18.723   20.839   52.596    2.464  105.192
>    262144  18.912   18.590   18.635   54.726    0.415  218.905
>    131072  18.732   18.578   19.797   53.837    1.505  430.694
>     65536  19.046   18.872   19.318   53.678    0.516  858.852
>     32768  18.490   18.582   20.374   53.583    2.353 1714.661
>     16384  19.138   19.215   20.602   52.167    1.744 3338.700
> 
> 8) Server RA-context Patched, client: default, server: 64
> max_sectors_kb, RA 2MB
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  21.029   21.654   21.093   48.177    0.630    0.753
>  33554432  21.174   19.759   20.659   49.918    1.435    1.560
>  16777216  20.385   20.235   22.145   49.026    1.976    3.064
>   8388608  19.053   20.162   20.158   51.778    1.391    6.472
>   4194304  20.123   23.173   20.073   48.696    3.188   12.174
>   2097152  19.401   20.824   20.326   50.778    1.500   25.389
>   1048576  21.821   21.401   21.026   47.825    0.724   47.825
>    524288  21.478   20.742   21.355   48.332    0.742   96.664
>    262144  20.290   20.183   20.980   50.004    0.853  200.015
>    131072  20.299   21.501   20.766   49.127    1.158  393.020
>     65536  21.087   19.340   20.867   50.193    1.959  803.092
>     32768  21.597   21.223   23.504   46.410    2.039 1485.132
>     16384  21.681   21.709   22.944   46.343    1.212 2965.967
> 
> 9) Server RA-context Patched, client: 64 max_sectors_kb, default RA.
> server: 64 max_sectors_kb, RA 2MB
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  42.767   40.615   41.188   24.672    0.535    0.386
>  33554432  41.204   42.294   40.514   24.780    0.437    0.774
>  16777216  39.774   42.809   41.804   24.720    0.762    1.545
>   8388608  42.292   41.799   40.386   24.689    0.486    3.086
>   4194304  41.784   39.037   41.830   25.073    0.819    6.268
>   2097152  41.983   41.145   44.115   24.164    0.703   12.082
>   1048576  41.468   43.495   41.640   24.276    0.520   24.276
>    524288  42.631   42.724   41.267   24.267    0.387   48.535
>    262144  41.930   41.954   41.975   24.408    0.011   97.634
>    131072  42.511   41.266   42.835   24.269    0.393  194.154
>     65536  41.307   41.544   40.746   24.857    0.203  397.704
>     32768  42.270   42.728   40.822   24.425    0.478  781.607
>     16384  39.307   40.044   40.259   25.686    0.264 1643.908
>      8192  41.258   40.879   40.969   24.955    0.098 3194.183
> 
> 10) Server RA-context Patched, client: default max_sectors_kb, 2MB RA.
> server: 64 max_sectors_kb, RA 2MB
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  26.160   26.878   25.790   38.982    0.666    0.609
>  33554432  25.832   25.362   25.695   39.956    0.309    1.249
>  16777216  26.119   24.769   25.526   40.221    0.876    2.514
>   8388608  25.660   26.257   25.106   39.898    0.730    4.987
>   4194304  26.603   25.404   25.271   39.773    0.910    9.943
>   2097152  26.012   24.815   26.064   39.973    0.914   19.986
>   1048576  25.256   27.073   25.153   39.693    1.323   39.693
>    524288  29.452   28.883   29.146   35.118    0.280   70.236
>    262144  26.559   27.315   26.837   38.067    0.440  152.268
>    131072  25.259   25.794   25.992   39.879    0.483  319.030
>     65536  26.417   25.205   26.177   39.503    0.808  632.047
>     32768  26.453   26.401   25.759   39.083    0.474 1250.669
>     16384  24.701   24.609   25.143   41.265    0.385 2640.945
> 
> 11) Server RA-context Patched, client: 64 max_sectors_kb, 2MB. RA
> server: 64 max_sectors_kb, RA 2MB
> blocksize       R        R        R   R(avg,    R(std        R
>   (bytes)     (s)      (s)      (s)    MB/s)   ,MB/s)   (IOPS)
>  67108864  29.629   31.703   30.407   33.513    0.930    0.524
>  33554432  29.768   29.598   30.717   34.111    0.553    1.066
>  16777216  30.054   30.640   30.102   33.837    0.295    2.115
>   8388608  29.906   29.744   31.394   33.762    0.813    4.220
>   4194304  30.708   30.797   30.418   33.420    0.177    8.355
>   2097152  31.364   29.646   30.712   33.511    0.781   16.755
>   1048576  30.757   30.600   30.470   33.455    0.128   33.455
>    524288  29.715   31.176   29.977   33.822    0.701   67.644
>    262144  30.533   30.218   30.259   33.755    0.155  135.021
>    131072  30.403   32.609   30.651   32.831    1.016  262.645
>     65536  30.846   30.208   32.116   32.993    0.835  527.889
>     32768  30.526   29.794   30.556   33.809    0.397 1081.878
>     16384  31.560   31.532   30.938   32.673    0.301 2091.092

These results are from the server without the io_context-2.6.29 and 
readahead-2.6.29 patches applied, and with the CFQ scheduler, correct?
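
The active scheduler can be checked and switched per device via sysfs, e.g.
(with sdb as a placeholder):

    # the scheduler shown in square brackets is the active one
    cat /sys/block/sdb/queue/scheduler
    echo cfq > /sys/block/sdb/queue/scheduler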

Then we see how the reordering of requests caused by many I/O threads 
submitting I/O in separate I/O contexts badly affects performance, and no 
RA scheme, especially at the default 128KB RA size, can solve it. A smaller 
max_sectors_kb on the client => more requests sent at once => more 
reordering on the server => worse throughput. Although, Fengguang, in 
theory context RA with a 2MB RA size should help considerably here, no?
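
To put rough numbers on that chain, here is a back-of-the-envelope count of
requests needed to move one 2MB RA window at various max_sectors_kb values
(512KB assumed here as the typical default):

    # 2048 KB per RA window divided by the per-request cap
    for mks in 64 128 512; do
        echo "max_sectors_kb=$mks: $((2048 / mks)) requests per 2MB RA window"
    done

i.e. 32 requests at 64KB versus 4 at 512KB, so far more opportunities for
reordering between the many I/O threads.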

Ronald, can you perform those tests again with both io_context-2.6.29 
and readahead-2.6.29 patches applied on the server, please?
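
Applying them should be roughly this (the patch file names and paths are
placeholders for whatever your checkout of the SCST SVN repository provides):

    cd /usr/src/linux-2.6.29
    patch -p1 < io_context-2.6.29.patch
    patch -p1 < readahead-2.6.29.patch
    # rebuild the kernel and reboot into it before rerunning the tests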

Thanks,
Vlad
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
