Message-ID: <CA+55aFzAZwC69kZg3W8Mp6ERQOXXY3=U7jKpWqxDWi6W=6Z12Q@mail.gmail.com>
Date: Mon, 19 Oct 2015 08:21:35 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Ingo Molnar <mingo@...nel.org>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
David Ahern <dsahern@...il.com>, Jiri Olsa <jolsa@...hat.com>,
Hitoshi Mitake <mitake@....info.waseda.ac.jp>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 02/14] perf/bench: Default to all routines in 'perf bench mem'
On Mon, Oct 19, 2015 at 1:04 AM, Ingo Molnar <mingo@...nel.org> wrote:
>
> triton:~> perf bench mem all
> # Running mem/memcpy benchmark...
> Routine default (Default memcpy() provided by glibc)
> 4.957170 GB/Sec (with prefault)
> Routine x86-64-unrolled (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
> 4.379204 GB/Sec (with prefault)
> Routine x86-64-movsq (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
> 4.264465 GB/Sec (with prefault)
> Routine x86-64-movsb (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
> 6.554111 GB/Sec (with prefault)
Is this skylake? And why are the numbers so low? Even on my laptop
(Haswell), I get ~21GB/s (when setting cpufreq to performance).
It's interesting that 'movsb' for you is so much better. It's been
promising before, and it *should* be able to do better than manual
copying, but it's not been that noticeable on the machines I've
tested. But I haven't used Skylake or Broadwell yet.
cpufreq might be making a difference too. Maybe it's just ramping up
the CPU? Or is that really repeatable?
Linus
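One rough way to check whether a low number is cpufreq ramp-up rather than a stable result is to time the same copy several times in a row and compare the first run against the later ones. The sketch below is not part of 'perf bench'; it only exercises the C library's memmove() through Python's ctypes, and the function names, the 32 MB buffer size, the run counts, and the GB = 2**30 bytes convention are all assumptions made for illustration. Absolute figures will include some interpreter overhead, so only the run-to-run trend is meaningful.

```python
import ctypes
import time


def copy_throughput_gbps(size_bytes, iterations=10):
    """Time `iterations` memmove() calls and return GB/s (GB = 2**30 bytes, an assumption)."""
    src = ctypes.create_string_buffer(size_bytes)
    dst = ctypes.create_string_buffer(size_bytes)
    # Prefault both buffers by writing to every byte, so page faults
    # are not charged to the timed copy loop.
    ctypes.memset(src, 1, size_bytes)
    ctypes.memset(dst, 0, size_bytes)

    start = time.perf_counter()
    for _ in range(iterations):
        ctypes.memmove(dst, src, size_bytes)
    elapsed = time.perf_counter() - start
    return (size_bytes * iterations) / elapsed / 2**30


def repeated_runs(size_bytes, runs=5, iterations=10):
    """Return one throughput figure per back-to-back run."""
    return [copy_throughput_gbps(size_bytes, iterations) for _ in range(runs)]


if __name__ == "__main__":
    # Hypothetical buffer size; pick something well above the last-level cache.
    for i, gbps in enumerate(repeated_runs(32 * 1024 * 1024)):
        print(f"run {i}: {gbps:.2f} GB/s")
```

If the first run is clearly slower than the rest, frequency ramp-up is a plausible explanation; if all runs agree, the number is probably repeatable for that machine and governor setting.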