linux-kernel - Re: linux-next: manual merge of the akpm tree with the tip tree

Message-ID: <20170331174818.6sqwonjhuonjmpif@hirez.programming.kicks-ass.net>
Date:   Fri, 31 Mar 2017 19:48:18 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Andi Kleen <ak@...ux.intel.com>
Cc:     Stephen Rothwell <sfr@...b.auug.org.au>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
        Linux-Next Mailing List <linux-next@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: linux-next: manual merge of the akpm tree with the tip tree

On Fri, Mar 31, 2017 at 09:02:42AM -0700, Andi Kleen wrote:
> On Fri, Mar 31, 2017 at 04:45:46PM +0200, Peter Zijlstra wrote:
> > On Fri, Mar 31, 2017 at 06:54:48AM -0700, Andi Kleen wrote:
> > > > Argh!
> > > > 
> > > > Andrew, please drop that patch. And the x86 out-of-line of __atomic_add_unless().
> > > 
> > > Why dropping the second?  Do you have something better?
> > 
> > The try_cmpxchg() patches save about half the text, and do not have the
> > out-of-line penalty as shown here:
> > 
> >    https://lkml.kernel.org/r/20170322165144.dtidvvbxey7w5pbd@hirez.programming.kicks-ass.net
> 
> Where is the source for the benchmark?

In that email; heck marc.info even provides a downloadable link, you
don't even have to go find it in your local lkml archives.

> Based on the description it sounds like it's testing atomic_inc(),
> which my patches don't change.

Yes, reading is hard.

It tests:

 lock incl

vs

 call refcount_inc

vs

 $inlined refcount_inc

And refcount_inc() is more complex than add_unless().
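
For readers without the kernel source at hand, the difference in complexity can be sketched roughly as below. This is a userspace approximation using C11 atomics, not the kernel code: the names sketch_add_unless() and sketch_refcount_inc() are hypothetical, and the saturation behaviour is only an assumption about what a refcount-style increment has to check. The loops use the compare-exchange form that refreshes the expected value on failure, which is the same idea as try_cmpxchg().

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical sketch: add 'a' to *v unless *v == u.
 * Returns the value observed before the (possible) addition.
 */
static inline int sketch_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	do {
		if (old == u)
			break;
		/* On failure the compare-exchange reloads 'old' for us. */
	} while (!atomic_compare_exchange_weak(v, &old, old + a));

	return old;
}

/*
 * Hypothetical sketch of a refcount-style increment: one extra check
 * compared to add_unless(). Returns false if the count was already 0.
 */
static inline bool sketch_refcount_inc(atomic_uint *r)
{
	unsigned int old = atomic_load_explicit(r, memory_order_relaxed);

	do {
		if (old == 0)
			return false;	/* object already released */
		if (old == UINT_MAX)
			return true;	/* saturated: leave the count pinned */
	} while (!atomic_compare_exchange_weak_explicit(r, &old, old + 1,
							memory_order_relaxed,
							memory_order_relaxed));

	return true;
}
```

A plain "lock incl" has neither the loop nor the checks, which is why the three variants are worth comparing separately.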

> BTW testing such things in tight loops is bad practice. If you run
> them back to back the CPU pipeline has to do much more serialization,
> which is usually not realistic and drastically overestimates
> the overhead.
> 
> A better practice is to run some real workload. If you want to see
> cycle counts you can look at LBR cycles, or PT cycles from sampling or tracing.

Hey, at least I did benchmark it. You just waved your hands and are
causing extra work for other people.

> > > On the first there were no 0day regressions, so at least basic performance
> > > checking has been done.
> > 
> > The first is superseded by much better patches in the scheduler tree.
> 
> Which patches exactly?  The new patches shrink the text too?

Try your local google-fu; or look at the patch that conflicted, it's
that one and the next.

In the end it comes down to -mm carrying patches against trees that are
maintained elsewhere without acks from said maintainers. I don't feel
bad about causing conflicts.