lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fdc14384-e3b7-921c-cc52-5992f3cbfd2c@suse.cz>
Date:   Wed, 16 Feb 2022 15:24:56 +0100
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Mel Gorman <mgorman@...hsingularity.net>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     Aaron Lu <aaron.lu@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Michal Hocko <mhocko@...nel.org>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH 4/5] mm/page_alloc: Free pages in a single pass during
 bulk free

On 2/15/22 15:51, Mel Gorman wrote:
> free_pcppages_bulk() has taken two passes through the pcp lists since
> commit 0a5f4e5b4562 ("mm/free_pcppages_bulk: do not hold lock when picking
> pages to free") due to deferring the cost of selecting PCP lists until
> the zone lock is held. Now that list selection is simplier, the main
> cost during selection is bulkfree_pcp_prepare() which in the normal case
> is a simple check and prefetching. As the list manipulations have cost
> in itself, go back to freeing pages in a single pass.
> 
> The series up to this point was evaulated using a trunc microbenchmark
> that is truncating sparse files stored in page cache (mmtests config
> config-io-trunc). Sparse files were used to limit filesystem interaction.
> 
> The results versus a revert of storing high-order pages in the PCP lists is
> 
> 1-socket Skylake
>                               5.17.0-rc3             5.17.0-rc3             5.17.0-rc3
>                                  vanilla    mm-reverthighpcp-v1r1     mm-highpcpopt-v1
> Min       elapsed      540.00 (   0.00%)      530.00 (   1.85%)      530.00 (   1.85%)
> Amean     elapsed      543.00 (   0.00%)      530.00 *   2.39%*      530.00 *   2.39%*
> Stddev    elapsed        4.83 (   0.00%)        0.00 ( 100.00%)        0.00 ( 100.00%)
> CoeffVar  elapsed        0.89 (   0.00%)        0.00 ( 100.00%)        0.00 ( 100.00%)
> Max       elapsed      550.00 (   0.00%)      530.00 (   3.64%)      530.00 (   3.64%)
> BAmean-50 elapsed      540.00 (   0.00%)      530.00 (   1.85%)      530.00 (   1.85%)
> BAmean-95 elapsed      542.22 (   0.00%)      530.00 (   2.25%)      530.00 (   2.25%)
> BAmean-99 elapsed      542.22 (   0.00%)      530.00 (   2.25%)      530.00 (   2.25%)
> 
> 2-socket CascadeLake
>                               5.17.0-rc3             5.17.0-rc3             5.17.0-rc3
>                                  vanilla    mm-reverthighpcp-v1       mm-highpcpopt-v1
> Min       elapsed      510.00 (   0.00%)      500.00 (   1.96%)      500.00 (   1.96%)
> Amean     elapsed      529.00 (   0.00%)      521.00 (   1.51%)      516.00 *   2.46%*
> Stddev    elapsed       16.63 (   0.00%)       12.87 (  22.64%)        9.66 (  41.92%)
> CoeffVar  elapsed        3.14 (   0.00%)        2.47 (  21.46%)        1.87 (  40.45%)
> Max       elapsed      550.00 (   0.00%)      540.00 (   1.82%)      530.00 (   3.64%)
> BAmean-50 elapsed      516.00 (   0.00%)      512.00 (   0.78%)      510.00 (   1.16%)
> BAmean-95 elapsed      526.67 (   0.00%)      518.89 (   1.48%)      514.44 (   2.32%)
> BAmean-99 elapsed      526.67 (   0.00%)      518.89 (   1.48%)      514.44 (   2.32%)
> 
> The original motivation for multi-passes was will-it-scale page_fault1
> using $nr_cpu processes.
> 
> 2-socket CascadeLake (40 cores, 80 CPUs HT enabled)
>                                                     5.17.0-rc3                 5.17.0-rc3
>                                                        vanilla         mm-highpcpopt-v1r4
> Hmean     page_fault1-processes-2        2694662.26 (   0.00%)      2696801.07 (   0.08%)
> Hmean     page_fault1-processes-5        6425819.34 (   0.00%)      6426573.21 (   0.01%)
> Hmean     page_fault1-processes-8        9642169.10 (   0.00%)      9647444.94 (   0.05%)
> Hmean     page_fault1-processes-12      12167502.10 (   0.00%)     12073323.10 *  -0.77%*
> Hmean     page_fault1-processes-21      15636859.03 (   0.00%)     15587449.50 *  -0.32%*
> Hmean     page_fault1-processes-30      25157348.61 (   0.00%)     25111707.15 *  -0.18%*
> Hmean     page_fault1-processes-48      27694013.85 (   0.00%)     27728568.63 (   0.12%)
> Hmean     page_fault1-processes-79      25928742.64 (   0.00%)     25920933.41 (  -0.03%) <---
> Hmean     page_fault1-processes-110     25730869.75 (   0.00%)     25695727.57 *  -0.14%*
> Hmean     page_fault1-processes-141     25626992.42 (   0.00%)     25675346.68 *   0.19%*
> Hmean     page_fault1-processes-172     25611651.35 (   0.00%)     25650940.14 *   0.15%*
> Hmean     page_fault1-processes-203     25577298.75 (   0.00%)     25584848.65 (   0.03%)
> Hmean     page_fault1-processes-234     25580686.07 (   0.00%)     25601794.52 *   0.08%*
> Hmean     page_fault1-processes-265     25570215.47 (   0.00%)     25553191.25 (  -0.07%)
> Hmean     page_fault1-processes-296     25549488.62 (   0.00%)     25530311.58 (  -0.08%)
> Hmean     page_fault1-processes-320     25555149.05 (   0.00%)     25585059.83 (   0.12%)
> 
> The differences are mostly within the noise and the difference close to
> $nr_cpus is negligible.
> 
> Signed-off-by: Mel Gorman <mgorman@...hsingularity.net>


Reviewed-by: Vlastimil Babka <vbabka@...e.cz>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ