Date:   Thu, 15 Dec 2022 21:47:10 +0000
From:   Matthew Wilcox <willy@...radead.org>
To:     Nico Pache <npache@...hat.com>
Cc:     Sidhartha Kumar <sidhartha.kumar@...cle.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        muchun.song@...ux.dev, mike.kravetz@...cle.com,
        akpm@...ux-foundation.org, gerald.schaefer@...ux.ibm.com,
        Waiman Long <llong@...hat.com>
Subject: Re: [RFC V2] mm: add the zero case to page[1].compound_nr in
 set_compound_order

On Thu, Dec 15, 2022 at 02:38:28PM -0700, Nico Pache wrote:
> To expand a little more on the analysis:
> I computed the latency/throughput between <+24> and <+27> using
> Intel's manual (APPENDIX D):
> 
> The bitmath solution shows a total latency of 2.5 with a throughput of 0.5.
> The branch solution shows a total latency of 4 and a throughput of 1.5.
> 
> Given this is not a tight loop, and the next instruction requires
> the data computed, lower latency is the better trade-off.
> 
> Just wanted to add that little piece :)

I appreciate how hard you're working on this, but it really is straining
at gnats ;-)  For a modern CPU, the most important thing is cache misses
and avoiding dirtying cachelines.  Cycle counting isn't that important
when an L3 cache miss takes 2000 (or more) cycles.
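
For readers following along, the two variants being compared can be
sketched roughly as below. This is only an illustration: nr_branch(),
nr_bitmath() and check_variants_agree() are hypothetical stand-ins, not
the kernel's set_compound_order(), and the masking form is an assumed
branch-free formulation that may differ from the one in the patch.
Both map order 0 to 0 and any other order to 1U << order.

```c
#include <assert.h>

/* Hypothetical helper: "branch" variant, zero pages for order 0. */
static unsigned int nr_branch(unsigned int order)
{
	return order ? 1U << order : 0;
}

/*
 * Hypothetical helper: "bitmath" variant (assumed form).
 * 1U << 0 == 1 has only bit 0 set, so clearing bit 0 yields 0.
 * For order >= 1 bit 0 is already clear, so the value is unchanged.
 */
static unsigned int nr_bitmath(unsigned int order)
{
	return (1U << order) & ~1U;
}

/* Check that both variants agree for every shift that fits in 32 bits. */
static void check_variants_agree(void)
{
	unsigned int order;

	for (order = 0; order < 32; order++)
		assert(nr_branch(order) == nr_bitmath(order));
}
```

Either form writes the same value to the same cacheline, so the choice
does not affect the cache behaviour discussed above.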
