Message-ID: <20161019181308.maacqqzdx4ep5yld@linutronix.de>
Date: Wed, 19 Oct 2016 20:13:08 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Davidlohr Bueso <dave@...olabs.net>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
Davidlohr Bueso <dbueso@...e.de>
Subject: Re: [PATCH] perf/bench-futex: Avoid worker cacheline bouncing
On 2016-10-19 10:59:33 [-0700], Davidlohr Bueso wrote:
> Sebastian noted that the overhead of worker thread ops (throughput)
> accounting was causing 'perf' to appear in the profiles, consuming
> a non-trivial (i.e. 13%) amount of CPU. This is caused by cacheline
> bouncing on the increment of w->ops. We can easily fix this by
> just working on a local copy and updating the actual worker once
> done running, and ready to show the program summary. There is no
> danger of the worker being accessed concurrently, so we can trust
> that no stale value is being seen by another thread.
>
> Reported-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> --- a/tools/perf/bench/futex-hash.c
> +++ b/tools/perf/bench/futex-hash.c
> @@ -63,8 +63,9 @@ static const char * const bench_futex_hash_usage[] = {
> static void *workerfn(void *arg)
> {
> int ret;
> - unsigned int i;
> struct worker *w = (struct worker *) arg;
> + unsigned int i;
> + unsigned long ops = w->ops; /* avoid cacheline bouncing */
We start at 0, so there is probably no need to initialize it with w->ops.
Sebastian
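
For illustration, a minimal standalone sketch of the pattern discussed above:
each worker counts in a local variable and stores to its shared struct only
once, at the end. This is not the futex-hash.c code. The struct layout,
MAX_THREADS, the iters field and run_workers() are assumptions made up for the
example, and the loop body merely stands in for a futex operation. The local
counter starts at 0, as suggested in the reply.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64	/* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for the benchmark's per-thread state. */
struct worker {
	pthread_t thread;
	unsigned long iters;	/* number of operations to perform */
	unsigned long ops;	/* throughput, published once at the end */
};

static void *workerfn(void *arg)
{
	struct worker *w = arg;
	unsigned long ops = 0;	/* local copy, private to this thread */
	unsigned long i;

	for (i = 0; i < w->iters; i++)
		ops++;		/* stands in for one futex operation */

	/*
	 * Single store to the shared array. Neighbouring workers may
	 * share a cacheline, so incrementing w->ops inside the loop
	 * would make that line bounce between CPUs.
	 */
	w->ops = ops;
	return NULL;
}

/* Run nthreads workers; return the summed op count, or 0 on error. */
unsigned long run_workers(unsigned int nthreads, unsigned long iters)
{
	struct worker workers[MAX_THREADS];
	unsigned long total = 0;
	unsigned int i, started = 0;

	assert(nthreads <= MAX_THREADS);

	for (i = 0; i < nthreads; i++) {
		workers[i].iters = iters;
		workers[i].ops = 0;
		if (pthread_create(&workers[i].thread, NULL, workerfn, &workers[i]))
			break;
		started++;
	}

	/* Joining guarantees each worker's final store is visible here. */
	for (i = 0; i < started; i++) {
		pthread_join(workers[i].thread, NULL);
		total += workers[i].ops;
	}

	return started == nthreads ? total : 0;
}
```

Calling run_workers(4, 100000) from a main() and building with -pthread should
return 400000. The summary is read only after pthread_join(), which is why no
other thread can observe a stale w->ops.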