Message-ID: <20121121114728.GZ8218@suse.de>
Date: Wed, 21 Nov 2012 11:47:28 +0000
From: Mel Gorman <mgorman@...e.de>
To: Ingo Molnar <mingo@...nel.org>
Cc: David Rientjes <rientjes@...gle.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Paul Turner <pjt@...gle.com>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
Christoph Lameter <cl@...ux.com>,
Rik van Riel <riel@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Johannes Weiner <hannes@...xchg.org>,
Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH] x86/mm: Don't flush the TLB on #WP pmd fixups
On Tue, Nov 20, 2012 at 01:31:56PM +0100, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@...nel.org> wrote:
>
> > * Ingo Molnar <mingo@...nel.org> wrote:
> >
> > > numa/core profile:
> > >
> > > 95.66% perf-1201.map [.] 0x00007fe4ad1c8fc7
> > > 1.70% libjvm.so [.] 0x0000000000381581
> > > 0.59% [vdso] [.] 0x0000000000000607
> > > 0.19% [kernel] [k] do_raw_spin_lock
> > > 0.11% [kernel] [k] generic_smp_call_function_interrupt
> > > 0.11% [kernel] [k] timekeeping_get_ns.constprop.7
> > > 0.08% [kernel] [k] ktime_get
> > > 0.06% [kernel] [k] get_cycles
> > > 0.05% [kernel] [k] __native_flush_tlb
> > > 0.05% [kernel] [k] rep_nop
> > > 0.04% perf [.] add_hist_entry.isra.9
> > > 0.04% [kernel] [k] rcu_check_callbacks
> > > 0.04% [kernel] [k] ktime_get_update_offsets
> > > 0.04% libc-2.15.so [.] __strcmp_sse2
> > >
> > > No page fault overhead (see the page fault rate further below)
> > > - the NUMA scanning overhead shows up only through some mild
> > > TLB flush activity (which I'll fix btw).
> >
> > The patch attached below should get rid of that mild TLB
> > flushing activity as well.
>
> This has further increased SPECjbb from 203k/sec to 207k/sec,
> i.e. it's now 5% faster than mainline - THP enabled.
>
> The profile is now totally flat even during a full 32-WH SPECjbb
> run, with the highest overhead entries left all related to timer
> IRQ processing or profiling. That is on a system that should be
> very close to yours.
>
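(The patch body itself is not quoted above. As a rough sketch only, the
kind of change the subject line describes, dropping the local TLB flush
when a #WP fault fixup only makes an entry more permissive, might look
something like the following against x86's ptep_set_access_flags(); the
actual patch may differ in detail:)

/*
 * Sketch, not the actual patch: on x86 a page fault is architecturally
 * guaranteed to invalidate the TLB entry for the faulting address, so
 * when a #WP fixup only makes a pte/pmd *more* permissive the explicit
 * flush can be dropped. The worst case is one spurious fault, never a
 * stale translation.
 */
int ptep_set_access_flags(struct vm_area_struct *vma,
			  unsigned long address, pte_t *ptep,
			  pte_t entry, int dirty)
{
	int changed = !pte_same(*ptep, entry);

	if (changed && dirty) {
		*ptep = entry;
		pte_update_defer(vma->vm_mm, address, ptep);
		/* flush_tlb_page(vma, address) intentionally omitted */
	}

	return changed;
}

The same reasoning would apply to the pmd-level helper used for huge
page fixups.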
This is a stab in the dark, but are you always running with profiling
enabled? I have not checked this with perf, but a number of years ago I
found that oprofile could distort results quite badly (by 7-30%,
depending on the workload at the time) when I was evaluating hugetlbfs
and THP. In some cases profiling would show that a patch series improved
performance when the same series regressed with profiling disabled. The
sampling rate had to be reduced quite a bit to avoid this effect.
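
For reference, a minimal, self-contained sketch (not from this thread)
of how a reduced sampling rate can be requested through the
perf_event_open(2) interface; the 100 Hz figure is arbitrary and only
illustrates trading profile resolution for lower perturbation:

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>

/* glibc provides no wrapper for perf_event_open(2) */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	long fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.freq = 1;		/* sample_freq is samples/sec, not a period */
	attr.sample_freq = 100;	/* low rate to limit measurement distortion */
	attr.disabled = 1;	/* enable explicitly once configured */

	fd = perf_event_open(&attr, 0 /* this task */, -1 /* any cpu */, -1, 0);
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}
	printf("cycle sampling event opened at 100 Hz (fd %ld)\n", fd);
	close(fd);
	return 0;
}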
--
Mel Gorman
SUSE Labs