Message-ID: <20130701102306.GC23515@pd.tnic>
Date:	Mon, 1 Jul 2013 12:23:06 +0200
From:	Borislav Petkov <bp@...en8.de>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Wedson Almeida Filho <wedsonaf@...il.com>,
	Ingo Molnar <mingo@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH] x86: Use asm-goto to implement mutex fast path on x86-64

On Mon, Jul 01, 2013 at 09:50:46AM +0200, Ingo Molnar wrote:
> Not sure - the main thing we want to know is whether it gets faster.
> The _amount_ will depend on things like precise usage patterns,
> caching, etc. - but rarely does a real workload turn a win like this
> into a loss.

Yep, and it does get faster, by a whopping 5.7 seconds (220.19s ->
214.53s elapsed, about 2.6%)!

Almost all standard counters go down a bit.

Interestingly, branch misses increase slightly. The asm goto version
does actually jump to the fail_fn from within the asm, so maybe this
puzzles the branch predictor a bit, although the instructions look the
same and the jumps are forward in both versions.

Oh well, we don't know where those additional misses happened, so they
could be somewhere else entirely, or simply noise: the increase is
about 0.02%, the same size as the reported +- 0.02% variance.

In any case, we're getting faster, so not worth investigating I guess.


plain 3.10
==========

 Performance counter stats for '../build-kernel.sh' (5 runs):

    1312558.712266 task-clock                #    5.961 CPUs utilized            ( +-  0.02% )
         1,036,629 context-switches          #    0.790 K/sec                    ( +-  0.24% )
            55,118 cpu-migrations            #    0.042 K/sec                    ( +-  0.25% )
        46,505,184 page-faults               #    0.035 M/sec                    ( +-  0.00% )
 4,768,420,289,997 cycles                    #    3.633 GHz                      ( +-  0.02% ) [83.79%]
 3,424,161,066,397 stalled-cycles-frontend   #   71.81% frontend cycles idle     ( +-  0.02% ) [83.78%]
 2,483,143,574,419 stalled-cycles-backend    #   52.07% backend  cycles idle     ( +-  0.04% ) [67.40%]
 3,091,612,061,933 instructions              #    0.65  insns per cycle
                                             #    1.11  stalled cycles per insn  ( +-  0.01% ) [83.93%]
   677,787,215,988 branches                  #  516.386 M/sec                    ( +-  0.01% ) [83.77%]
    25,438,736,368 branch-misses             #    3.75% of all branches          ( +-  0.02% ) [83.78%]

     220.191740778 seconds time elapsed                                          ( +-  0.32% )

 + patch
========

 Performance counter stats for '../build-kernel.sh' (5 runs):

    1309995.427337 task-clock                #    6.106 CPUs utilized            ( +-  0.09% )
         1,033,446 context-switches          #    0.789 K/sec                    ( +-  0.23% )
            55,228 cpu-migrations            #    0.042 K/sec                    ( +-  0.28% )
        46,484,992 page-faults               #    0.035 M/sec                    ( +-  0.00% )
 4,759,631,961,013 cycles                    #    3.633 GHz                      ( +-  0.09% ) [83.78%]
 3,415,933,806,156 stalled-cycles-frontend   #   71.77% frontend cycles idle     ( +-  0.12% ) [83.78%]
 2,476,066,765,933 stalled-cycles-backend    #   52.02% backend  cycles idle     ( +-  0.10% ) [67.38%]
 3,089,317,073,397 instructions              #    0.65  insns per cycle
                                             #    1.11  stalled cycles per insn  ( +-  0.02% ) [83.95%]
   677,623,252,827 branches                  #  517.271 M/sec                    ( +-  0.01% ) [83.79%]
    25,444,376,740 branch-misses             #    3.75% of all branches          ( +-  0.02% ) [83.79%]

     214.533868029 seconds time elapsed                                          ( +-  0.36% )
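
A rough way to check which of the deltas above stand out from the
run-to-run variance is sketched below in Python. The numbers are copied
from the two perf stat outputs in this mail. The helper names
(relative_change, exceeds_noise) and the rule of comparing the change
against the sum of the two "+-" percentages are assumptions made for
illustration; this is a crude sanity check, not a proper statistical test
and not something perf does itself.

```python
# Means and relative deviations (the "+-" column, in percent) copied from
# the two perf stat outputs: ((plain 3.10), (+ patch)).
STATS = {
    "cycles": ((4_768_420_289_997, 0.02), (4_759_631_961_013, 0.09)),
    "instructions": ((3_091_612_061_933, 0.01), (3_089_317_073_397, 0.02)),
    "branches": ((677_787_215_988, 0.01), (677_623_252_827, 0.01)),
    "branch-misses": ((25_438_736_368, 0.02), (25_444_376_740, 0.02)),
    "seconds elapsed": ((220.191740778, 0.32), (214.533868029, 0.36)),
}


def relative_change(before, after):
    """Change from before to after, in percent of before."""
    return (after - before) / before * 100.0


def exceeds_noise(before, after, noise_before, noise_after):
    """Crude check (an assumption, see above): is the absolute change
    larger than the two reported +- percentages added together?"""
    return abs(relative_change(before, after)) > noise_before + noise_after


if __name__ == "__main__":
    for name, ((before, nb), (after, na)) in STATS.items():
        change = relative_change(before, after)
        noisy = not exceeds_noise(before, after, nb, na)
        verdict = "within noise" if noisy else "above noise"
        print(f"{name:16s} {change:+7.3f}%  ({verdict})")
```

With this rule the elapsed-time improvement is clearly larger than the
reported variance, while the branch-misses increase is not.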

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.