Message-ID: <398e9887-6d6e-e1d3-abcf-43a6d7496bc8@intel.com>
Date:   Mon, 17 Jul 2017 09:02:36 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     daniel.m.jordan@...cle.com, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [RFC PATCH v1 5/6] mm: parallelize clear_gigantic_page

On 07/14/2017 03:16 PM, daniel.m.jordan@...cle.com wrote:
> Machine:  Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz, 288 cpus, 1T memory
> Test:    Clear a range of gigantic pages
> nthread   speedup   size (GiB)   min time (s)   stdev (s)
>       1                    100          41.13    0.03
>       2     2.03x          100          20.26    0.14
>       4     4.28x          100           9.62    0.09
>       8     8.39x          100           4.90    0.05
>      16    10.44x          100           3.94    0.03
...
>       1                    800         434.91    1.81
>       2     2.54x          800         170.97    1.46
>       4     4.98x          800          87.38    1.91
>       8    10.15x          800          42.86    2.59
>      16    12.99x          800          33.48    0.83

What was the actual test here?  Did you just use sysfs to allocate 800GB
of 1GB huge pages?
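
For reference, here is the kind of test I am picturing -- a minimal
userspace sketch, which is just my assumption about the setup, not
necessarily your harness: reserve the pages through
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages, then fault
them in with mmap(MAP_HUGETLB), which should go through the
clear_gigantic_page() path:

/* Sketch only; the page count comes from argv and is illustrative. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)
#endif

int main(int argc, char **argv)
{
	size_t npages = argc > 1 ? strtoul(argv[1], NULL, 0) : 1;
	size_t len = npages << 30;		/* 1 GiB per page */

	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
		       -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* The first touch of each 1GB page is what gets timed: the
	 * kernel clears the whole gigantic page in the fault path. */
	for (size_t i = 0; i < npages; i++)
		p[i << 30] = 0;

	munmap(p, len);
	return 0;
}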

This test should be entirely memory-bandwidth-limited, right?  Are you
contending here that a single core can only use 1/10th of the memory
bandwidth when clearing a page?
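
(Back-of-the-envelope from the table above: 100 GiB / 41.13 s is about
2.4 GiB/s with one thread, and 100 GiB / 3.94 s is about 25 GiB/s with
16 threads, which is roughly the 1/10th ratio in question.)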

Or, does all the gain here come because we are round-robin-allocating
the pages across all 8 NUMA nodes' memory controllers and the speedup
here is because we're not doing the clearing across the interconnect?
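
One way to tease that apart is something along these lines -- a
userspace analogue using libnuma, not your kernel code, with the chunk
size and 64-node cap picked arbitrarily: put one chunk on each node,
clear everything once from a single thread (mostly remote, across the
interconnect), then once with a thread pinned to each node (all local
controllers):

/* Illustration only; link with -lnuma -lpthread. */
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <pthread.h>
#include <numa.h>

#define CHUNK (1UL << 30)			/* 1 GiB per node */

static int nnodes;
static char *chunks[64];

static double now(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

static void *clear_local(void *arg)
{
	long node = (long)arg;

	numa_run_on_node(node);			/* clear via the local controller */
	memset(chunks[node], 0, CHUNK);
	return NULL;
}

int main(void)
{
	pthread_t t[64];
	double t0;

	if (numa_available() < 0 || (nnodes = numa_max_node() + 1) > 64)
		return 1;

	for (int n = 0; n < nnodes; n++) {
		chunks[n] = numa_alloc_onnode(CHUNK, n);
		if (!chunks[n])
			return 1;
		memset(chunks[n], 0, CHUNK);	/* fault pages in on node n */
	}

	t0 = now();				/* (a) one thread, mostly remote */
	for (int n = 0; n < nnodes; n++)
		memset(chunks[n], 0, CHUNK);
	printf("single thread:    %.2fs\n", now() - t0);

	t0 = now();				/* (b) one thread per node, all local */
	for (long n = 0; n < nnodes; n++)
		pthread_create(&t[n], NULL, clear_local, (void *)n);
	for (int n = 0; n < nnodes; n++)
		pthread_join(t[n], NULL);
	printf("per-node threads: %.2fs\n", now() - t0);

	for (int n = 0; n < nnodes; n++)
		numa_free(chunks[n], CHUNK);
	return 0;
}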
