Message-Id: <95FB622E-4E2B-4EFC-998F-4A3522BA27BD@gmail.com>
Date: Thu, 1 Feb 2018 10:45:55 -0800
From: Nadav Amit <nadav.amit@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Dave Hansen <dave.hansen@...ux.intel.com>,
the arch/x86 maintainers <x86@...nel.org>,
Andy Lutomirski <luto@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
LKML <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH] x86: Align TLB invalidation info
Peter Zijlstra <peterz@...radead.org> wrote:
> On Wed, Jan 31, 2018 at 09:38:46PM -0800, Nadav Amit wrote:
>
>> I used ftrace to measure the execution time of flush_tlb_func_remote() on a
>> 2-socket Haswell machine, using a microbenchmark I wrote for some research
>> project.
>
> However cool ftrace is, it is _really_ bad for such uses. The cost of
> using ftrace is many times higher than any change you could effect
> by this.
>
> A microbench and/or perf is what you should use for this.
Don’t expect a microbenchmark to show the impact of a remote NUMA access,
which costs a few tens of nanoseconds. (And indeed I did not see one.) Each
iteration of #PF - MADV_DONTNEED takes several microseconds, so the impact is
lost in the noise.
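For illustration only, a minimal sketch of a #PF - MADV_DONTNEED loop. This is an assumption about the general shape of such a microbenchmark, not the benchmark referred to above; the function name fault_and_discard_ns is hypothetical. It is single-threaded, so it exercises only the local fault and zap path. Triggering remote flushes would additionally need other threads of the same process running on other CPUs.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose MAP_ANONYMOUS and MADV_DONTNEED on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: repeatedly fault in one anonymous page and discard
 * it with MADV_DONTNEED. Returns the average wall-clock time per iteration
 * in nanoseconds, or -1.0 on failure.
 */
double fault_and_discard_ns(long iterations)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || iterations <= 0)
        return -1.0;

    void *map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1.0;

    /* volatile so the compiler cannot drop the store that causes the fault */
    volatile char *buf = map;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iterations; i++) {
        buf[0] = 1;                                   /* #PF: page is not mapped */
        if (madvise(map, (size_t)page, MADV_DONTNEED) != 0) {
            munmap(map, (size_t)page);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    munmap(map, (size_t)page);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9 +
                (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iterations;
}
```

Calling fault_and_discard_ns(1000000) a few times and comparing the averages gives a feel for the run-to-run noise that a difference of a few tens of nanoseconds would have to stand out from.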
You are right that ftrace introduces overhead, but the variance is relatively
low. If I stretch the struct to 3 cache lines, I see a 20ns overhead. Anyhow,
I think this line of code got more than its fair share of attention.