linux-kernel - Re: linux-next: manual merge of the akpm tree with the tip tree

Message-ID: <20170331160242.GF4543@tassilo.jf.intel.com>
Date:   Fri, 31 Mar 2017 09:02:42 -0700
From:   Andi Kleen <ak@...ux.intel.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Stephen Rothwell <sfr@...b.auug.org.au>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
        Linux-Next Mailing List <linux-next@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: linux-next: manual merge of the akpm tree with the tip tree

On Fri, Mar 31, 2017 at 04:45:46PM +0200, Peter Zijlstra wrote:
> On Fri, Mar 31, 2017 at 06:54:48AM -0700, Andi Kleen wrote:
> > > Argh!
> > > 
> > > Andrew, please drop that patch. And the x86 out-of-line of __atomic_add_unless().
> > 
> > Why dropping the second?  Do you have something better?
> 
> The try_cmpxchg() patches save about half the text, and do not have the
> out-of-line penalty as shown here:
> 
>    https://lkml.kernel.org/r/20170322165144.dtidvvbxey7w5pbd@hirez.programming.kicks-ass.net

Where is the source for the benchmark?

Based on the description it sounds like it's testing atomic_inc(), which my patches
don't change.
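
To make the distinction concrete, here is a minimal user-space sketch
using C11 atomics. It is not the kernel code and not either patch set;
the sketch_* names are hypothetical. An atomic_inc()-style operation is
one unconditional read-modify-write, while an atomic_add_unless()-style
operation needs a compare-and-exchange loop, which is the part the
out-of-line and try_cmpxchg() work is about.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical analogue of atomic_inc(): a single unconditional RMW. */
static inline void sketch_atomic_inc(atomic_int *v)
{
	atomic_fetch_add(v, 1);
}

/*
 * Hypothetical analogue of atomic_add_unless(): add 'a' to *v unless
 * *v == u. Returns true if the add happened.
 */
static inline bool sketch_atomic_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	do {
		if (old == u)
			return false;
		/*
		 * On failure the C11 call stores the value it observed
		 * into 'old', so the loop does not reload *v itself.
		 */
	} while (!atomic_compare_exchange_weak(v, &old, old + a));

	return true;
}
```

The C11 compare-exchange reports success as a boolean and refreshes the
expected value on failure, which is the same general shape as a
try_cmpxchg()-style loop.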

BTW, testing such things in tight loops is bad practice. If you run
them back to back, the CPU pipeline has to do much more serialization
than it would in real code, which is usually not realistic and
drastically overestimates the overhead.

A better practice is to run some real workload. If you want to see
cycle counts, you can look at LBR (Last Branch Record) cycles, or PT
(Processor Trace) cycles, from sampling or tracing.

> > On the first there were no 0day regressions, so at least basic performance
> > checking has been done.
> 
> The first is superseded by much better patches in the scheduler tree.

Which patches exactly?  Do the new patches shrink the text too?

-Andi