Message-ID: <538CB83C.9080409@hp.com>
Date: Mon, 02 Jun 2014 11:45:32 -0600
From: Thavatchai Makphaibulchoke <thavatchai.makpahibulchoke@...com>
To: Theodore Ts'o <tytso@....edu>
CC: Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org
Subject: Re: [PATCH 2/2] ext4: Reduce contention on s_orphan_lock
On 05/20/2014 07:57 AM, Theodore Ts'o wrote:
> On Tue, May 20, 2014 at 02:33:23AM -0600, Thavatchai Makphaibulchoke wrote:
>
> Thavatchai, it would be really great if you could do lock_stat runs
> with both Jan's latest patches as well as yours. We need to
> understand where the differences are coming from.
>
> As I understand things, there are two differences between Jan and your
> approaches. The first is that Jan is using the implicit locking of
> i_mutex to avoid needing to keep a hashed array of mutexes to
> synchronize an individual inode's being added or removed to the orphan
> list.
>
> The second is that you've split the orphan mutex into an on-disk mutex
> and a in-memory spinlock.
>
> Is it possible to split up your patch so we can measure the benefits
> of each of these two changes? More interestingly, is there a way we
> can use the your second change in concert with Jan's changes?
>
> Regards,
>
> - Ted
>
Thanks to Jan for pointing out one optimization in orphan_add() that I'd missed.
After integrating that into my patch, I've rerun the following aim7 workloads: alltests, custom, dbase, disk, fserver, new_fserver, shared and short. Here are the results.
On an 8 core (16 thread) machine, both my revised patch (with the additional optimization from Jan's orphan_add()) and version 3 of Jan's patch give about the same results for most of the workloads, except fserver and new_fserver, on which Jan's outperforms mine by about 9% and 16%, respectively.
Here are the lock_stat output for disk,
Jan's patch,
lock_stat version 0.4
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
class name con-bounces contentions waittime-min waittime-max waittime-total waittime-avg acq-bounces acquisitions holdtime-min holdtime-max holdtime-total holdtime-avg
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
&sbi->s_orphan_lock: 80189 80246 3.94 464489.22 77219615.47 962.29 503289 809004 0.10 476537.44 3424587.77 4.23
Mine,
&sbi->s_orphan_lock: 82215 82259 3.61 640876.19 15098561.09 183.55 541422 794254 0.10 640368.86 4425140.61 5.57
&sbi->s_orphan_op_mutex[n]: 102507 104880 4.21 1335087.21 1392773487.19 13279.69 398328 840120 0.11 1334711.17 397596137.90 473.26
For new_fserver,
Jan's patch,
&sbi->s_orphan_lock: 1063059 1063369 5.57 1073325.95 59535205188.94 55987.34 4525570 8446052 0.10 75625.72 10700844.58 1.27
Mine,
&sbi->s_orphan_lock: 1171433 1172220 3.02 349678.21 553168029.92 471.90 5517262 8446052 0.09 254108.75 16504015.29 1.95
&sbi->s_orphan_op_mutex[n]: 2176760 2202674 3.44 633129.10 55206091750.06 25063.21 3259467 8452918 0.10 349687.82 605683982.34 71.65
On an 80 core (160 thread) machine, mine outperforms Jan's in alltests, custom, fserver, new_fserver and shared by about the same margin it did over the baseline, around 20%. For all these workloads, Jan's patch does not seem to show any noticeable improvement over the baseline kernel. I'm getting about the same performance with the rest of the workloads.
Here are the lock_stat output for alltests,
Jan's,
&sbi->s_orphan_lock: 2762871 2763355 4.46 49043.39 1763499587.40 638.17 5878253 6475844 0.15 20508.98 70827300.79 10.94
Mine,
&sbi->s_orphan_lock: 1171433 1172220 3.02 349678.21 553168029.92 471.90 5517262 8446052 0.09 254108.75 16504015.29 1.95
&sbi->s_orphan_op_mutex[n]: 783176 785840 4.95 30358.58 432279688.66 550.09 2899889 6505883 0.16 30254.12 1668330140.08 256.43
For custom,
Jan's,
&sbi->s_orphan_lock: 5706466 5707069 4.54 44063.38 3312864313.18 580.48 11942088 13175060 0.15 15944.34 142660367.51 10.83
Mine,
&sbi->s_orphan_lock: 5518186 5518558 4.84 32040.05 2436898419.22 441.58 12290996 13175234 0.17 23160.65 141234888.88 10.72
&sbi->s_orphan_op_mutex[n]: 1565216 1569333 4.50 32527.02 788215876.94 502.26 5894074 13196979 0.16 71073.57 3128766227.92 237.08
For dbase,
Jan's,
&sbi->s_orphan_lock: 14453 14489 5.84 39442.57 8678179.21 598.95 119847 153686 0.17 4390.25 1406816.03 9.15
Mine,
&sbi->s_orphan_lock: 13847 13868 6.23 31314.03 7982386.22 575.60 120332 153542 0.17 9354.86 1458061.28 9.50
&sbi->s_orphan_op_mutex[n]: 1700 1717 22.00 50566.24 1225749.82 713.89 85062 189435 0.16 31374.44 14476217.56 76.42
In case the line-wrapping makes it hard to read, I've also attached the results as a text file.
The lock_stat output seems to show that with my patch s_orphan_lock performs better across the board. But on the smaller machine, the hashed mutexes seem to offset the performance gain in s_orphan_lock; increasing the hashed mutex array size would likely make it perform better.
Jan, if you could send me your orphan stress test, I could run lock_stat for more performance comparisons.
Ted, please let me know if there is anything else you'd like me to experiment with. If you'd like, I could also resubmit my revised patch for you to take a look.
Thanks,
Mak.
View attachment "lock_stat.txt" of type "text/plain" (4913 bytes)