Message-ID: <20200812122529.GH13995@kernel.org>
Date: Wed, 12 Aug 2020 09:25:29 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Vincent Whitchurch <vincent.whitchurch@...s.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>, kernel@...s.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf bench mem: Always memset source before memcpy
On Mon, Aug 10, 2020 at 03:34:04PM +0200, Vincent Whitchurch wrote:
> For memcpy, the source pages are memset to zero only when --cycles is
> used. This leads to wildly different results with or without --cycles,
> since all source pages are likely to be mapped to the same zero page
> without explicit writes.
Thanks, applied.
- Arnaldo
> Before this fix:
>
> $ export cmd="./perf stat -e LLC-loads -- ./perf bench \
> mem memcpy -s 1024MB -l 100 -f default"
> $ $cmd
>
> 2,935,826 LLC-loads
> 3.821677452 seconds time elapsed
>
> $ $cmd --cycles
>
> 217,533,436 LLC-loads
> 8.616725985 seconds time elapsed
>
> After this fix:
>
> $ $cmd
>
> 214,459,686 LLC-loads
> 8.674301124 seconds time elapsed
>
> $ $cmd --cycles
>
> 214,758,651 LLC-loads
> 8.644480006 seconds time elapsed
>
> Fixes: 47b5757bac03c3387c ("perf bench mem: Move boilerplate memory allocation to the infrastructure")
> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@...s.com>
> ---
> tools/perf/bench/mem-functions.c | 21 +++++++++++----------
> 1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/tools/perf/bench/mem-functions.c b/tools/perf/bench/mem-functions.c
> index 9235b76501be..19d45c377ac1 100644
> --- a/tools/perf/bench/mem-functions.c
> +++ b/tools/perf/bench/mem-functions.c
> @@ -223,12 +223,8 @@ static int bench_mem_common(int argc, const char **argv, struct bench_mem_info *
>  	return 0;
>  }
>
> -static u64 do_memcpy_cycles(const struct function *r, size_t size, void *src, void *dst)
> +static void memcpy_prefault(memcpy_t fn, size_t size, void *src, void *dst)
>  {
> -	u64 cycle_start = 0ULL, cycle_end = 0ULL;
> -	memcpy_t fn = r->fn.memcpy;
> -	int i;
> -
>  	/* Make sure to always prefault zero pages even if MMAP_THRESH is crossed: */
>  	memset(src, 0, size);
>
> @@ -237,6 +233,15 @@ static u64 do_memcpy_cycles(const struct function *r, size_t size, void *src, vo
>  	 * to not measure page fault overhead:
>  	 */
>  	fn(dst, src, size);
> +}
> +
> +static u64 do_memcpy_cycles(const struct function *r, size_t size, void *src, void *dst)
> +{
> +	u64 cycle_start = 0ULL, cycle_end = 0ULL;
> +	memcpy_t fn = r->fn.memcpy;
> +	int i;
> +
> +	memcpy_prefault(fn, size, src, dst);
>
>  	cycle_start = get_cycles();
>  	for (i = 0; i < nr_loops; ++i)
> @@ -252,11 +257,7 @@ static double do_memcpy_gettimeofday(const struct function *r, size_t size, void
>  	memcpy_t fn = r->fn.memcpy;
>  	int i;
>
> -	/*
> -	 * We prefault the freshly allocated memory range here,
> -	 * to not measure page fault overhead:
> -	 */
> -	fn(dst, src, size);
> +	memcpy_prefault(fn, size, src, dst);
>
>  	BUG_ON(gettimeofday(&tv_start, NULL));
>  	for (i = 0; i < nr_loops; ++i)
> --
> 2.25.1
>
--
- Arnaldo