Message-ID: <j2t4e5e476b1004201350xbd08a002l3329fbdd4fb1b8db@mail.gmail.com>
Date:	Tue, 20 Apr 2010 22:50:09 +0200
From:	Corrado Zoccolo <czoccolo@...il.com>
To:	Miklos Szeredi <mszeredi@...e.cz>
Cc:	Jens Axboe <jens.axboe@...cle.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Jan Kara <jack@...e.cz>, Suresh Jayaraman <sjayaraman@...e.de>
Subject: Re: CFQ read performance regression

On Mon, Apr 19, 2010 at 1:46 PM, Miklos Szeredi <mszeredi@...e.cz> wrote:
> Hi Corrado,
>
> On Sat, 2010-04-17 at 14:46 +0200, Corrado Zoccolo wrote:
>> I don't think this is related to CFQ. I've made a graph of the
>> accessed (read) sectors (see attached).
>> You can see that the green cloud (2.6.16) is much more concentrated,
>> while the red one (2.6.32) is split in two, and the individual lines
>> are easier to make out.
>> This means that the FS put more distance between the blocks of the
>> files written by the tiobench threads, so read time suffers because
>> the disk head has to perform longer seeks. On the other hand, if you
>> read those files sequentially with a single thread, performance may
>> be better with the new layout, so YMMV. When testing 2.6.32 and up,
>> you should also test with the low_latency setting disabled, since
>> tuning for latency can hurt throughput.
Hi Miklos,
can you give some more information about the setup?
How much memory does the machine have, what is the disk configuration
(is this a hardware RAID?), and so on.
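
For reference, the low_latency tunable mentioned in the quoted mail
lives in sysfs. A minimal C sketch for disabling it; the device path
is an assumption (if I read the (8,64) numbers in the traces right,
the disk is /dev/sde; adjust as needed):

#include <stdio.h>

int main(void)
{
    /* assumed device; substitute the disk under test */
    const char *path = "/sys/block/sde/queue/iosched/low_latency";
    FILE *f = fopen(path, "w");

    if (!f) {
        perror(path);
        return 1;
    }
    /* "0" disables low_latency, "1" (the default) enables it */
    fputs("0\n", f);
    fclose(f);
    return 0;
}

This is just the programmatic equivalent of echoing 0 into the sysfs
file as root.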
>
> low_latency is set to zero in all tests.
>
> The layout difference doesn't explain why setting the scheduler to
> "noop" consistently almost doubles read throughput in the 8-thread
> tiobench.  This fact alone pretty clearly indicates that the I/O
> scheduler is the culprit.
From the attached btt output, I see that a lot of time is spent
sleeping while waiting for new request structures to be allocated
(S2G is the time a submitter sleeps before it can get a free request):
> S2G               0.022460311   6.581680621  23.144763751          15
Since noop doesn't attach fancy data to each request, it avoids those
allocations, and therefore the sleeps.
The allocation delays, though, may not be entirely attributable to
the I/O scheduler; running under memory pressure will make them worse
regardless of which scheduler is used.
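
To illustrate why exhausting the request pool shows up as sleep time,
here is a conceptual sketch (emphatically not the kernel's actual
get_request() path): when no request structure is free, the submitter
blocks until one is returned.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Conceptual sketch of a blocking request allocator. In the kernel
 * the submitter sleeps on a waitqueue until a request completes and
 * is freed; here that is approximated with a retry loop.
 */
static void *alloc_request_blocking(size_t size)
{
    void *req;

    while ((req = malloc(size)) == NULL)
        usleep(1000);   /* stand-in for sleeping on the waitqueue */

    return req;
}

int main(void)
{
    void *req = alloc_request_blocking(256);

    printf("got request at %p\n", req);
    free(req);
    return 0;
}

The point is only that time spent in such a loop is accounted as
S2G/Q2G, not as device time (D2C), which matches the 2.6.32 trace.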

> There are other indications, see the attached btt output for both
> traces.  From there it appears that 2.6.16 does more and longer seeks,
> yet it's getting an overall better performance.
I see fewer seeks for 2.6.16, but longer ones on average.
It seems that 2.6.16 allows more requests from the same process to be
streamed to disk before switching to another process.
Since the timeslice is the same, it might be that we are limiting the
number of requests per queue due to memory congestion.
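
One quick check would be to compare the queue's request limit with the
average queue depth seen in the 2.6.32 trace (132.8). A minimal sketch
for reading it, with the same /dev/sde assumption as above:

#include <stdio.h>

int main(void)
{
    /* assumed device, as above; adjust for the disk under test */
    const char *path = "/sys/block/sde/queue/nr_requests";
    FILE *f = fopen(path, "r");
    long nr;

    if (!f) {
        perror(path);
        return 1;
    }
    if (fscanf(f, "%ld", &nr) == 1)
        printf("nr_requests = %ld\n", nr);
    fclose(f);
    return 0;
}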

> I've also tested with plain "dd" instead of tiobench, where the
> filesystem layout stayed exactly the same between tests.  The speed
> difference is still there.
Does dropping caches before the read test change the situation?
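
In case it helps, the caches can be dropped programmatically before
the read pass (needs root; sync first so dirty pages can actually be
reclaimed):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    FILE *f;

    sync(); /* flush dirty pages so they are reclaimable */

    f = fopen("/proc/sys/vm/drop_caches", "w");
    if (!f) {
        perror("/proc/sys/vm/drop_caches");
        return 1;
    }
    /* "3" drops both the page cache and slab caches (dentries, inodes) */
    fputs("3\n", f);
    fclose(f);
    return 0;
}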

Thanks,
Corrado
>
> Thanks,
> Miklos
>
> ************************************************************************
> btt output for 2.6.16:
> ==================== All Devices ====================
>
>            ALL           MIN           AVG           MAX           N
> --------------- ------------- ------------- ------------- -----------
>
> Q2Q               0.000000047   0.000854417   1.003550405       67465
> Q2G               0.000000458   0.000001211   0.000123527       46062
> G2I               0.000000123   0.000001815   0.000494517       46074
> Q2M               0.000000186   0.000001798   0.000010296       21404
> I2D               0.000000162   0.000158001   0.040794333       46062
> M2D               0.000000878   0.000133130   0.040585566       21404
> D2C               0.000053870   0.023778266   0.234154543       67466
> Q2C               0.000056746   0.023931014   0.234176000       67466
>
> ==================== Device Overhead ====================
>
>       DEV |       Q2G       G2I       Q2M       I2D       D2C
> ---------- | --------- --------- --------- --------- ---------
>  (  8, 64) |   0.0035%   0.0052%   0.0024%   0.4508%  99.3617%
> ---------- | --------- --------- --------- --------- ---------
>   Overall |   0.0035%   0.0052%   0.0024%   0.4508%  99.3617%
>
> ==================== Device Merge Information ====================
>
>       DEV |       #Q       #D   Ratio |   BLKmin   BLKavg   BLKmax    Total
> ---------- | -------- -------- ------- | -------- -------- -------- --------
>  (  8, 64) |    67466    46062     1.5 |        8      597     1024 27543688
>
> ==================== Device Q2Q Seek Information ====================
>
>       DEV |          NSEEKS            MEAN          MEDIAN | MODE
> ---------- | --------------- --------------- --------------- | ---------------
>  (  8, 64) |           67466        866834.0               0 | 0(34558)
> ---------- | --------------- --------------- --------------- | ---------------
>   Overall |          NSEEKS            MEAN          MEDIAN | MODE
>   Average |           67466        866834.0               0 | 0(34558)
>
> ==================== Device D2D Seek Information ====================
>
>       DEV |          NSEEKS            MEAN          MEDIAN | MODE
> ---------- | --------------- --------------- --------------- | ---------------
>  (  8, 64) |           46062       1265503.9               0 | 0(13242)
> ---------- | --------------- --------------- --------------- | ---------------
>   Overall |          NSEEKS            MEAN          MEDIAN | MODE
>   Average |           46062       1265503.9               0 | 0(13242)
>
> ==================== Plug Information ====================
>
>       DEV |    # Plugs # Timer Us  | % Time Q Plugged
> ---------- | ---------- ----------  | ----------------
>  (  8, 64) |      29271(       533) |   3.878105328%
>
>       DEV |    IOs/Unp   IOs/Unp(to)
> ---------- | ----------   ----------
>  (  8, 64) |       19.2         19.7
> ---------- | ----------   ----------
>   Overall |    IOs/Unp   IOs/Unp(to)
>   Average |       19.2         19.7
>
> ==================== Active Requests At Q Information ====================
>
>       DEV |  Avg Reqs @ Q
> ---------- | -------------
>  (  8, 64) |           0.8
>
> ==================== I/O Active Period Information ====================
>
>       DEV |     # Live      Avg. Act     Avg. !Act % Live
> ---------- | ---------- ------------- ------------- ------
>  (  8, 64) |        545   0.100012237   0.005766640  94.56
> ---------- | ---------- ------------- ------------- ------
>  Total Sys |        545   0.100012237   0.005766640  94.56
>
> ************************************************************************
> btt output for 2.6.32:
>
> ==================== All Devices ====================
>
>            ALL           MIN           AVG           MAX           N
> --------------- ------------- ------------- ------------- -----------
>
> Q2Q               0.000000279   0.001710581   1.803934429       69429
> Q2G               0.000000908   0.001798735  23.144764798       54940
> S2G               0.022460311   6.581680621  23.144763751          15
> G2I               0.000000628   0.000001576   0.000120409       54942
> Q2M               0.000000628   0.000001611   0.000013201       14490
> I2D               0.000000768   0.289812035  86.820205789       54940
> M2D               0.000005518   0.098208187   0.794441158       14490
> D2C               0.000173141   0.008056256   0.219516385       69430
> Q2C               0.000179077   0.259305605  86.820559403       69430
>
> ==================== Device Overhead ====================
>
>       DEV |       Q2G       G2I       Q2M       I2D       D2C
> ---------- | --------- --------- --------- --------- ---------
>  (  8, 64) |   0.5489%   0.0005%   0.0001%  88.4394%   3.1069%
> ---------- | --------- --------- --------- --------- ---------
>   Overall |   0.5489%   0.0005%   0.0001%  88.4394%   3.1069%
>
> ==================== Device Merge Information ====================
>
>       DEV |       #Q       #D   Ratio |   BLKmin   BLKavg   BLKmax    Total
> ---------- | -------- -------- ------- | -------- -------- -------- --------
>  (  8, 64) |    69430    54955     1.3 |        8      520     2048 28614984
>
> ==================== Device Q2Q Seek Information ====================
>
>       DEV |          NSEEKS            MEAN          MEDIAN | MODE
> ---------- | --------------- --------------- --------------- | ---------------
>  (  8, 64) |           69430        546377.3               0 | 0(50235)
> ---------- | --------------- --------------- --------------- | ---------------
>   Overall |          NSEEKS            MEAN          MEDIAN | MODE
>   Average |           69430        546377.3               0 | 0(50235)
>
> ==================== Device D2D Seek Information ====================
>
>       DEV |          NSEEKS            MEAN          MEDIAN | MODE
> ---------- | --------------- --------------- --------------- | ---------------
>  (  8, 64) |           54955        565286.3               0 | 0(37535)
> ---------- | --------------- --------------- --------------- | ---------------
>   Overall |          NSEEKS            MEAN          MEDIAN | MODE
>   Average |           54955        565286.3               0 | 0(37535)
>
> ==================== Plug Information ====================
>
>       DEV |    # Plugs # Timer Us  | % Time Q Plugged
> ---------- | ---------- ----------  | ----------------
>  (  8, 64) |       2310(         0) |   0.049353257%
>
>       DEV |    IOs/Unp   IOs/Unp(to)
> ---------- | ----------   ----------
>  (  8, 64) |        7.3          0.0
> ---------- | ----------   ----------
>   Overall |    IOs/Unp   IOs/Unp(to)
>   Average |        7.3          0.0
>
> ==================== Active Requests At Q Information ====================
>
>       DEV |  Avg Reqs @ Q
> ---------- | -------------
>  (  8, 64) |         132.8
>
> ==================== I/O Active Period Information ====================
>
>       DEV |     # Live      Avg. Act     Avg. !Act % Live
> ---------- | ---------- ------------- ------------- ------
>  (  8, 64) |       4835   0.023848998   0.000714665  97.09
> ---------- | ---------- ------------- ------------- ------
>  Total Sys |       4835   0.023848998   0.000714665  97.09
>
