Message-ID: <20170331160242.GF4543@tassilo.jf.intel.com>
Date: Fri, 31 Mar 2017 09:02:42 -0700
From: Andi Kleen <ak@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Stephen Rothwell <sfr@...b.auug.org.au>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
Linux-Next Mailing List <linux-next@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: linux-next: manual merge of the akpm tree with the tip tree

On Fri, Mar 31, 2017 at 04:45:46PM +0200, Peter Zijlstra wrote:
> On Fri, Mar 31, 2017 at 06:54:48AM -0700, Andi Kleen wrote:
> > > Argh!
> > >
> > > Andrew, please drop that patch. And the x86 out-of-line of __atomic_add_unless().
> >
> > Why dropping the second? Do you have something better?
>
> The try_cmpxchg() patches save about half the text, and do not have the
> out-of-line penalty as shown here:
>
> https://lkml.kernel.org/r/20170322165144.dtidvvbxey7w5pbd@hirez.programming.kicks-ass.net

Where is the source for the benchmark?

Based on the description it sounds like it's testing atomic_inc(), which my patches
don't change.

BTW, testing such things in tight loops is bad practice. If you run
the atomics back to back, the CPU pipeline has to do much more
serialization than it would in real code, which is usually not
realistic and drastically overestimates the overhead.
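
To make the tight-loop concern concrete, here is a minimal userspace sketch. It is not the benchmark discussed in this thread (its source is not shown here), and every name in it (add_unless, now_ns, bench_tight, bench_spaced) is hypothetical. It assumes C11 atomics as a stand-in for the kernel's __atomic_add_unless() and times the same number of atomic operations twice: once back to back, and once separated by independent arithmetic.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical userspace stand-in for the kernel's __atomic_add_unless():
 * add `a` to *v unless *v == u. Returns the value seen before the add. */
static int add_unless(atomic_int *v, int a, int u)
{
    int old = atomic_load_explicit(v, memory_order_relaxed);

    /* On failure the compare-exchange stores the current value in `old`,
     * so the loop retries with fresh data and needs no explicit reload. */
    while (old != u && !atomic_compare_exchange_weak(v, &old, old + a))
        ;
    return old;
}

/* Wall-clock nanoseconds via C11 timespec_get(); good enough for a sketch,
 * although a monotonic clock would be preferable for real measurements. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Back-to-back atomics: the pattern criticised above. */
static uint64_t bench_tight(atomic_int *v, int iters)
{
    uint64_t t0 = now_ns();

    for (int i = 0; i < iters; i++)
        add_unless(v, 1, -1);
    return now_ns() - t0;
}

/* Same number of atomics, each preceded by `work` steps of arithmetic that
 * does not depend on the atomic. The elapsed time includes that work. */
static uint64_t bench_spaced(atomic_int *v, int iters, int work, uint32_t *sink)
{
    uint32_t x = 1;
    uint64_t t0 = now_ns();

    for (int i = 0; i < iters; i++) {
        for (int j = 0; j < work; j++)
            x = x * 1664525u + 1013904223u;
        add_unless(v, 1, -1);
    }
    uint64_t dt = now_ns() - t0;

    *sink = x; /* keep the arithmetic from being optimised away */
    return dt;
}
```

To compare the two, one would call bench_tight() and bench_spaced() with the same iteration count and subtract the time of a work-only run from the spaced result. The numbers depend heavily on the CPU and compiler flags, so even the spaced loop is only a rough illustration and no substitute for a real workload.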

A better practice is to run some real workload. If you want to see
cycle counts you can look at LBR (Last Branch Record) cycles, or
Intel PT (Processor Trace) cycles, from sampling or tracing.

> > On the first there were no 0day regressions, so at least basic performance
> > checking has been done.
>
> The first is superseded by much better patches in the scheduler tree.

Which patches exactly? Do the new patches shrink the text too?

-Andi