Message-ID: <YK9UFl7Grv/mwmVc@hirez.programming.kicks-ass.net>
Date: Thu, 27 May 2021 10:11:02 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Adrian Hunter <adrian.hunter@...el.com>
Cc: Leo Yan <leo.yan@...aro.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Namhyung Kim <namhyung@...nel.org>,
Andi Kleen <ak@...ux.intel.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 1/2] perf auxtrace: Change to use SMP memory barriers
On Thu, May 27, 2021 at 10:54:56AM +0300, Adrian Hunter wrote:
> On 19/05/21 5:03 pm, Leo Yan wrote:
> > The AUX ring buffer's head and tail can be accessed from multiple CPUs
> > on SMP system, so changes to use SMP memory barriers to replace the
> > uniprocessor barriers.
>
> I don't think user space should attempt to be SMP-aware.
Uhh, what? It pretty much has to. Since userspace cannot assume UP, it
must assume SMP.
> For perf tools, on __x86_64__ it looks like smp_rmb() is only a
> compiler barrier, whereas rmb() is a "lfence" memory barrier
> instruction, so this patch does not seem to do what the commit message
> says at least for x86.
The commit message is somewhat confused; *mb() are not UP barriers
(although they are available and useful on UP). They're device/dma
barriers.
> With regard to the AUX area, we don't know in general how data gets there,
> so using memory barriers seems sensible.
IIRC (but I didn't check) the rule was that the kernel needs to ensure
the AUX area is complete before it updates the head pointer. So if
userspace can observe the head pointer, it must then also be able to
observe the data. This is not something userspace can fix up anyway.
The ordering here is between the head pointer and the data, and from a
userspace perspective that's a regular smp ordering. Similar for the
tail update, that's between our reading the data and writing the tail,
regular cache coherent smp ordering.
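A minimal sketch of that ordering, using C11 atomics as a stand-in for the perf barrier macros. Everything named here (struct aux_ring, aux_produce(), aux_consume(), AUX_SIZE) is hypothetical and only illustrates the head/data and data/tail pairing; it is not the perf or kernel code:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define AUX_SIZE 4096u /* hypothetical buffer size */

/* Hypothetical stand-in for the control page plus the AUX area. */
struct aux_ring {
	_Atomic uint64_t head; /* advanced by the producer (the kernel's role) */
	_Atomic uint64_t tail; /* advanced by the consumer (userspace's role) */
	unsigned char data[AUX_SIZE];
};

/*
 * Producer side: the data must be complete before the new head is visible.
 * For brevity this sketch does not check for free space.
 */
static void aux_produce(struct aux_ring *r, const void *src, size_t len)
{
	uint64_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	const unsigned char *p = src;

	for (size_t i = 0; i < len; i++)
		r->data[(head + i) % AUX_SIZE] = p[i];

	/* Release: the data stores above are ordered before the head store. */
	atomic_store_explicit(&r->head, head + len, memory_order_release);
}

/* Consumer side: read head, then data, then publish the new tail. */
static size_t aux_consume(struct aux_ring *r, void *dst, size_t cap)
{
	uint64_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire: the data loads below cannot be reordered before this load. */
	uint64_t head = atomic_load_explicit(&r->head, memory_order_acquire);
	unsigned char *out = dst;
	size_t n = (size_t)(head - tail);

	if (n > cap)
		n = cap;
	for (size_t i = 0; i < n; i++)
		out[i] = r->data[(tail + i) % AUX_SIZE];

	/* Release: the data loads above are ordered before the tail store. */
	atomic_store_explicit(&r->tail, tail + n, memory_order_release);
	return n;
}
```

In this sketch the acquire load plays the role of READ_ONCE() followed by smp_rmb(), and the release store plays the role of a barrier followed by the tail write. The patch itself uses smp_mb() before the tail store, which is at least as strong.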
So ACK on the patch, it's sane and an optimization for both x86 and ARM.
Just the Changelog needs work.
> > Signed-off-by: Leo Yan <leo.yan@...aro.org>
> > ---
> > tools/perf/util/auxtrace.h | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> > index 472c0973b1f1..8bed284ccc82 100644
> > --- a/tools/perf/util/auxtrace.h
> > +++ b/tools/perf/util/auxtrace.h
> > @@ -452,7 +452,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
> > u64 head = READ_ONCE(pc->aux_head);
> >
> > /* Ensure all reads are done after we read the head */
> > - rmb();
> > + smp_rmb();
> > return head;
> > }
> >
> > @@ -466,7 +466,7 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
> > #endif
> >
> > /* Ensure all reads are done after we read the head */
> > - rmb();
> > + smp_rmb();
> > return head;
> > }
> >
> > @@ -478,7 +478,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
> > #endif
> >
> > /* Ensure all reads are done before we write the tail out */
> > - mb();
> > + smp_mb();
> > #if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
> > pc->aux_tail = tail;
> > #else
> >
>