Message-ID: <4A5D7794.2070607@vlnb.net>
Date: Wed, 15 Jul 2009 10:30:44 +0400
From: Vladislav Bolkhovitin <vst@...b.net>
To: Ronald Moesbergen <intercommit@...il.com>
CC: fengguang.wu@...el.com, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, kosaki.motohiro@...fujitsu.com,
Alan.Brunelle@...com, hifumi.hisashi@....ntt.co.jp,
linux-fsdevel@...r.kernel.org, jens.axboe@...cle.com,
randy.dunlap@...cle.com, Bart Van Assche <bart.vanassche@...il.com>
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev

Vladislav Bolkhovitin, on 07/14/2009 10:52 PM wrote:
> Ronald Moesbergen, on 07/13/2009 04:12 PM wrote:
>> 2009/7/10 Vladislav Bolkhovitin <vst@...b.net>:
>>> Vladislav Bolkhovitin, on 07/10/2009 12:43 PM wrote:
>>>> Ronald Moesbergen, on 07/10/2009 10:32 AM wrote:
>>>>>> I've also noticed long ago that reading data from block devices is
>>>>>> slower than from files on file systems mounted on those block
>>>>>> devices. Can anybody explain it?
>>>>>>
>>>>>> Looks like this is strangeness #2 that we uncovered in our tests
>>>>>> (the first one, earlier in this thread, was that the context RA
>>>>>> doesn't work as well as it should with cooperative I/O threads).
>>>>>>
>>>>>> Can you rerun the same 11 tests over a file on the file system, please?
>>>>> I'll see what I can do. Just to be sure: you want me to run
>>>>> blockdev-perftest on a file on the OCFS2 filesystem that is mounted
>>>>> on the client over iSCSI, right?
>>>> Yes, please.
>>> Forgot to mention that you should also configure your backend storage
>>> as a big file on a file system (preferably XFS), not as a direct
>>> device like /dev/vg/db-master.
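
(To be concrete, I mean a setup along these lines; the mount point, file
name and file size below are just examples, and the file should
preferably be bigger than the server's RAM so that reads aren't served
from the page cache:

   mkfs.xfs /dev/vg/db-master             # put XFS on the LV instead of
   mount /dev/vg/db-master /mnt/backend   # exporting it directly
   # create the big file to be exported:
   dd if=/dev/zero of=/mnt/backend/backing.img bs=1M count=16384
)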
>> Ok, here are the results:
>>
>> client kernel: 2.6.26-15lenny3 (debian)
>> server kernel: 2.6.29.5 with readahead patch
>>
>> Tests were done with XFS on both the target and the initiator. This
>> confirms your findings: using files instead of block devices is faster,
>> but only when using the io_context patch.
>
> Seems correct, except for case (2), which is still about 10% faster.
>
>> Without io_context patch:
>> 1) client: default, server: default
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 18.327 18.327 17.740 56.491 0.872 0.883
>> 33554432 18.662 18.311 18.116 55.772 0.683 1.743
>> 16777216 18.900 18.421 18.312 55.229 0.754 3.452
>> 8388608 18.893 18.533 18.281 55.156 0.743 6.895
>> 4194304 18.512 18.097 18.400 55.850 0.536 13.963
>> 2097152 18.635 18.313 18.676 55.232 0.486 27.616
>> 1048576 18.441 18.264 18.245 55.907 0.267 55.907
>> 524288 17.773 18.669 18.459 55.980 1.184 111.960
>> 262144 18.580 18.758 17.483 56.091 1.767 224.365
>> 131072 17.224 18.333 18.765 56.626 2.067 453.006
>> 65536 18.082 19.223 18.238 55.348 1.483 885.567
>> 32768 17.719 18.293 18.198 56.680 0.795 1813.766
>> 16384 17.872 18.322 17.537 57.192 1.024 3660.273
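
(A note on reading these tables: the figures appear to be derived as
follows, assuming a 1 GiB test area. Throughput is computed per run as
1024 MB divided by that run's time, R(avg) and R(std) are the mean and
standard deviation of the three per-run throughputs, and R(IOPS) is
R(avg) divided by the block size in MiB. For the 16384-byte row above,
for example:

   awk 'BEGIN { t1=17.872; t2=18.322; t3=17.537; bs=16384
     avg = (1024/t1 + 1024/t2 + 1024/t3) / 3
     printf "R(avg)=%.3f MB/s  R=%.1f IOPS\n", avg, avg / (bs/1048576) }'

prints R(avg)=57.192 MB/s and R=3660.3 IOPS, which matches the table up
to rounding of the printed times.)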
>>
>> 2) client: default, server: 64 max_sectors_kb, RA default
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 18.738 18.435 18.400 55.283 0.451 0.864
>> 33554432 18.046 18.167 17.572 57.128 0.826 1.785
>> 16777216 18.504 18.203 18.377 55.771 0.376 3.486
>> 8388608 22.069 18.554 17.825 53.013 4.766 6.627
>> 4194304 19.211 18.136 18.083 55.465 1.529 13.866
>> 2097152 18.647 17.851 18.511 55.866 1.071 27.933
>> 1048576 19.084 18.177 18.194 55.425 1.249 55.425
>> 524288 18.999 18.553 18.380 54.934 0.763 109.868
>> 262144 18.867 18.273 18.063 55.668 1.020 222.673
>> 131072 17.846 18.966 18.193 55.885 1.412 447.081
>> 65536 18.195 18.616 18.482 55.564 0.530 889.023
>> 32768 17.882 18.841 17.707 56.481 1.525 1807.394
>> 16384 17.073 18.278 17.985 57.646 1.689 3689.369
>>
>> 3) client: default, server: default max_sectors_kb, RA 2MB
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 18.658 17.830 19.258 55.162 1.750 0.862
>> 33554432 17.193 18.265 18.517 56.974 1.854 1.780
>> 16777216 17.531 17.681 18.776 56.955 1.720 3.560
>> 8388608 18.234 17.547 18.201 56.926 1.014 7.116
>> 4194304 18.057 17.923 17.901 57.015 0.218 14.254
>> 2097152 18.565 17.739 17.658 56.958 1.277 28.479
>> 1048576 18.393 17.433 17.314 57.851 1.550 57.851
>> 524288 18.939 17.835 18.972 55.152 1.600 110.304
>> 262144 18.562 19.005 18.069 55.240 1.141 220.959
>> 131072 19.574 17.562 18.251 55.576 2.476 444.611
>> 65536 19.117 18.019 17.886 55.882 1.647 894.115
>> 32768 18.237 17.415 17.482 57.842 1.200 1850.933
>> 16384 17.760 18.444 18.055 56.631 0.876 3624.391
>>
>> 4) client: default, server: 64 max_sectors_kb, RA 2MB
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 18.368 17.495 18.524 56.520 1.434 0.883
>> 33554432 18.209 17.523 19.146 56.052 2.027 1.752
>> 16777216 18.765 18.053 18.550 55.497 0.903 3.469
>> 8388608 17.878 17.848 18.389 56.778 0.774 7.097
>> 4194304 18.058 17.683 18.567 56.589 1.129 14.147
>> 2097152 18.896 18.384 18.697 54.888 0.623 27.444
>> 1048576 18.505 17.769 17.804 56.826 1.055 56.826
>> 524288 18.319 17.689 17.941 56.955 0.816 113.910
>> 262144 19.227 17.770 18.212 55.704 1.821 222.815
>> 131072 18.738 18.227 17.869 56.044 1.090 448.354
>> 65536 19.319 18.525 18.084 54.969 1.494 879.504
>> 32768 18.321 17.672 17.870 57.047 0.856 1825.495
>> 16384 18.249 17.495 18.146 57.025 1.073 3649.582
>>
>> With io_context patch:
>> 5) client: default, server: default
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 12.393 11.925 12.627 83.196 1.989 1.300
>> 33554432 11.844 11.855 12.191 85.610 1.142 2.675
>> 16777216 12.729 12.602 12.068 82.187 1.913 5.137
>> 8388608 12.245 12.060 14.081 80.419 5.469 10.052
>> 4194304 13.224 11.866 12.110 82.763 3.833 20.691
>> 2097152 11.585 12.584 11.755 85.623 3.052 42.811
>> 1048576 12.166 12.144 12.321 83.867 0.539 83.867
>> 524288 12.019 12.148 12.160 84.568 0.448 169.137
>> 262144 12.014 12.378 12.074 84.259 1.095 337.036
>> 131072 11.840 12.068 11.849 85.921 0.756 687.369
>> 65536 12.098 11.803 12.312 84.857 1.470 1357.720
>> 32768 11.852 12.635 11.887 84.529 2.465 2704.931
>> 16384 12.443 13.110 11.881 82.197 3.299 5260.620
>>
>> 6) client: default, server: 64 max_sectors_kb, RA default
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 13.033 12.122 11.950 82.911 3.110 1.295
>> 33554432 12.386 13.357 12.082 81.364 3.429 2.543
>> 16777216 12.102 11.542 12.053 86.096 1.860 5.381
>> 8388608 12.240 11.740 11.789 85.917 1.601 10.740
>> 4194304 11.824 12.388 12.042 84.768 1.621 21.192
>> 2097152 11.962 12.283 11.973 84.832 1.036 42.416
>> 1048576 12.639 11.863 12.010 84.197 2.290 84.197
>> 524288 11.809 12.919 11.853 84.121 3.439 168.243
>> 262144 12.105 12.649 12.779 81.894 1.940 327.577
>> 131072 12.441 12.769 12.713 81.017 0.923 648.137
>> 65536 12.490 13.308 12.440 80.414 2.457 1286.630
>> 32768 13.235 11.917 12.300 82.184 3.576 2629.883
>> 16384 12.335 12.394 12.201 83.187 0.549 5323.990
>>
>> 7) client: default, server: default max_sectors_kb, RA 2MB
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 12.017 12.334 12.151 84.168 0.897 1.315
>> 33554432 12.265 12.200 11.976 84.310 0.864 2.635
>> 16777216 12.356 11.972 12.292 83.903 1.165 5.244
>> 8388608 12.247 12.368 11.769 84.472 1.825 10.559
>> 4194304 11.888 11.974 12.144 85.325 0.754 21.331
>> 2097152 12.433 10.938 11.669 87.911 4.595 43.956
>> 1048576 11.748 12.271 12.498 84.180 2.196 84.180
>> 524288 11.726 11.681 12.322 86.031 2.075 172.062
>> 262144 12.593 12.263 11.939 83.530 1.817 334.119
>> 131072 11.874 12.265 12.441 84.012 1.648 672.093
>> 65536 12.119 11.848 12.037 85.330 0.809 1365.277
>> 32768 12.549 12.080 12.008 83.882 1.625 2684.238
>> 16384 12.369 12.087 12.589 82.949 1.385 5308.766
>>
>> 8) client: default, server: 64 max_sectors_kb, RA 2MB
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 12.664 11.793 11.963 84.428 2.575 1.319
>> 33554432 11.825 12.074 12.442 84.571 1.761 2.643
>> 16777216 11.997 11.952 10.905 88.311 3.958 5.519
>> 8388608 11.866 12.270 11.796 85.519 1.476 10.690
>> 4194304 11.754 12.095 12.539 84.483 2.230 21.121
>> 2097152 11.948 11.633 11.886 86.628 1.007 43.314
>> 1048576 12.029 12.519 11.701 84.811 2.345 84.811
>> 524288 11.928 12.011 12.049 85.363 0.361 170.726
>> 262144 12.559 11.827 11.729 85.140 2.566 340.558
>> 131072 12.015 12.356 11.587 85.494 2.253 683.952
>> 65536 11.741 12.113 11.931 85.861 1.093 1373.770
>> 32768 12.655 11.738 12.237 83.945 2.589 2686.246
>> 16384 11.928 12.423 11.875 84.834 1.711 5429.381
>>
>> 9) client: 64 max_sectors_kb, default RA. server: 64 max_sectors_kb, RA 2MB
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 13.570 13.491 14.299 74.326 1.927 1.161
>> 33554432 13.238 13.198 13.255 77.398 0.142 2.419
>> 16777216 13.851 13.199 13.463 75.857 1.497 4.741
>> 8388608 13.339 16.695 13.551 71.223 7.010 8.903
>> 4194304 13.689 13.173 14.258 74.787 2.415 18.697
>> 2097152 13.518 13.543 13.894 75.021 0.934 37.510
>> 1048576 14.119 14.030 13.820 73.202 0.659 73.202
>> 524288 13.747 14.781 13.820 72.621 2.369 145.243
>> 262144 14.168 13.652 14.165 73.189 1.284 292.757
>> 131072 14.112 13.868 14.213 72.817 0.753 582.535
>> 65536 14.604 13.762 13.725 73.045 2.071 1168.728
>> 32768 14.796 15.356 14.486 68.861 1.653 2203.564
>> 16384 13.079 13.525 13.427 76.757 1.111 4912.426
>>
>> 10) client: default max_sectors_kb, 2MB RA. server: 64 max_sectors_kb, RA 2MB
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 20.372 18.077 17.262 55.411 3.800 0.866
>> 33554432 17.287 17.620 17.828 58.263 0.740 1.821
>> 16777216 16.802 18.154 17.315 58.831 1.865 3.677
>> 8388608 17.510 18.291 17.253 57.939 1.427 7.242
>> 4194304 17.059 17.706 17.352 58.958 0.897 14.740
>> 2097152 17.252 18.064 17.615 58.059 1.090 29.029
>> 1048576 17.082 17.373 17.688 58.927 0.838 58.927
>> 524288 17.129 17.271 17.583 59.103 0.644 118.206
>> 262144 17.411 17.695 18.048 57.808 0.848 231.231
>> 131072 17.937 17.704 18.681 56.581 1.285 452.649
>> 65536 17.927 17.465 17.907 57.646 0.698 922.338
>> 32768 18.494 17.820 17.719 56.875 1.073 1819.985
>> 16384 18.800 17.759 17.575 56.798 1.666 3635.058
>>
>> 11) client: 64 max_sectors_kb, 2MB RA. server: 64 max_sectors_kb, RA 2MB
>> blocksize(bytes) R1(s) R2(s) R3(s) R(avg,MB/s) R(std,MB/s) R(IOPS)
>> 67108864 20.045 21.881 20.018 49.680 2.037 0.776
>> 33554432 20.768 20.291 20.464 49.938 0.479 1.561
>> 16777216 21.563 20.714 20.429 49.017 1.116 3.064
>> 8388608 21.290 21.109 21.308 48.221 0.205 6.028
>> 4194304 22.240 20.662 21.088 48.054 1.479 12.013
>> 2097152 20.282 21.098 20.580 49.593 0.806 24.796
>> 1048576 20.367 19.929 20.252 50.741 0.469 50.741
>> 524288 20.885 21.203 20.684 48.945 0.498 97.890
>> 262144 19.982 21.375 20.798 49.463 1.373 197.853
>> 131072 20.744 21.590 19.698 49.593 1.866 396.740
>> 65536 21.586 20.953 21.055 48.314 0.627 773.024
>> 32768 21.228 20.307 21.049 49.104 0.950 1571.327
>> 16384 21.257 21.209 21.150 48.289 0.100 3090.498
>
> The drop with 64 max_sectors_kb on the client is a consequence of how
> CFQ works. I can't find the exact code responsible for this, but from
> all signs, CFQ stops delaying requests once the number of outstanding
> requests exceeds some threshold, which is 2 or 3. With 64 max_sectors_kb
> and 5 SCST I/O threads this threshold is exceeded, so CFQ doesn't
> recover the order of requests, hence the performance drop. With the
> default 512 max_sectors_kb and 128K RA the server sees at most 2
> requests at a time.
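
For reference, both knobs live in the block queue's sysfs directory on
the server (sdX below is just a placeholder for the actual backend
device):

   cat /sys/block/sdX/queue/max_sectors_kb         # per-request size cap
   echo 64 > /sys/block/sdX/queue/max_sectors_kb
   echo 2048 > /sys/block/sdX/queue/read_ahead_kb  # 2MB readahead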
>
> Ronald, can you perform the same tests with 1 and 2 SCST I/O threads,
> please?
With the context-RA patch, please, in those and all future tests, since
it should make RA for cooperative threads much better.
> You can limit the number of SCST I/O threads with the num_threads
> parameter of the scst_vdisk module.
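
For example, something like:

   modprobe scst_vdisk num_threads=2

should give the 2-thread configuration (adjust to however you normally
load SCST).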
>
> Thanks,
> Vlad
>