Message-ID: <3c7dcd5d-fddd-5d3b-81ac-cb7b615b0338@intel.com>
Date: Thu, 27 May 2021 11:25:40 +0300
From: Adrian Hunter <adrian.hunter@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Leo Yan <leo.yan@...aro.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Namhyung Kim <namhyung@...nel.org>,
Andi Kleen <ak@...ux.intel.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 1/2] perf auxtrace: Change to use SMP memory barriers

On 27/05/21 11:11 am, Peter Zijlstra wrote:
> On Thu, May 27, 2021 at 10:54:56AM +0300, Adrian Hunter wrote:
>> On 19/05/21 5:03 pm, Leo Yan wrote:
>>> The AUX ring buffer's head and tail can be accessed from multiple CPUs
>>> on an SMP system, so change to SMP memory barriers in place of the
>>> uniprocessor barriers.
>>
>> I don't think user space should attempt to be SMP-aware.
>
> Uhh, what? It pretty much has to. Since userspace cannot assume UP, it
> must assume SMP.
Yeah, that is what I meant, but consequently we generally shouldn't be
using functions called smp_<anything>.
>
>> For perf tools, on __x86_64__ it looks like smp_rmb() is only a compiler
>> barrier, whereas rmb() is an "lfence" memory barrier instruction, so this
>> patch does not seem to do what the commit message says, at least for x86.
>
> The commit message is somewhat confused; *mb() are not UP barriers
> (although they are available and useful on UP). They're device/dma
> barriers.
>
>> With regard to the AUX area, we don't know in general how data gets there,
>> so using memory barriers seems sensible.
>
> IIRC (but I didn't check) the rule was that the kernel needs to ensure
> the AUX area is complete before it updates the head pointer. So if
> userspace can observe the head pointer, it must then also be able to
> observe the data. This is not something userspace can fix up anyway.
>
> The ordering here is between the head pointer and the data, and from a
> userspace perspective that's a regular smp ordering. Similar for the
> tail update, that's between our reading the data and writing the tail,
> regular cache coherent smp ordering.
>
> So ACK on the patch, it's sane and an optimization for both x86 and ARM.
> Just the Changelog needs work.
If all we want is a compiler barrier, then shouldn't that be what we use?
i.e. barrier()
>
>>> Signed-off-by: Leo Yan <leo.yan@...aro.org>
>>> ---
>>> tools/perf/util/auxtrace.h | 6 +++---
>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
>>> index 472c0973b1f1..8bed284ccc82 100644
>>> --- a/tools/perf/util/auxtrace.h
>>> +++ b/tools/perf/util/auxtrace.h
>>> @@ -452,7 +452,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
>>> u64 head = READ_ONCE(pc->aux_head);
>>>
>>> /* Ensure all reads are done after we read the head */
>>> - rmb();
>>> + smp_rmb();
>>> return head;
>>> }
>>>
>>> @@ -466,7 +466,7 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
>>> #endif
>>>
>>> /* Ensure all reads are done after we read the head */
>>> - rmb();
>>> + smp_rmb();
>>> return head;
>>> }
>>>
>>> @@ -478,7 +478,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
>>> #endif
>>>
>>> /* Ensure all reads are done before we write the tail out */
>>> - mb();
>>> + smp_mb();
>>> #if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
>>> pc->aux_tail = tail;
>>> #else
>>>
>>
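
For reference, the ordering discussed above (read the head, then the data;
finish reading the data, then write the tail) can be sketched in portable
C11 atomics. This is only an illustration under stated assumptions, not the
perf tools code: struct demo_ring and demo_ring_read() are hypothetical
names, the struct is a stand-in for the mmap'ed control page, and the size
is assumed to be a power of two. The acquire load plays the role of
READ_ONCE() followed by smp_rmb(), and the release store plays the role of
smp_mb() followed by the plain tail write.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared control page and AUX area. */
struct demo_ring {
	_Atomic uint64_t head;	/* advanced by the producer (the kernel) */
	_Atomic uint64_t tail;	/* advanced by the consumer (user space) */
	uint64_t size;		/* bytes in data[], assumed a power of two */
	unsigned char *data;
};

/* Copy up to len unread bytes into dst; returns the number copied. */
static size_t demo_ring_read(struct demo_ring *rb, void *dst, size_t len)
{
	/* Acquire: the data reads below cannot be ordered before this load. */
	uint64_t head = atomic_load_explicit(&rb->head, memory_order_acquire);
	/* Only this consumer writes tail, so a relaxed load is enough. */
	uint64_t tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
	size_t avail = (size_t)(head - tail);
	size_t n = avail < len ? avail : len;
	unsigned char *out = dst;

	for (size_t i = 0; i < n; i++)
		out[i] = rb->data[(tail + i) & (rb->size - 1)];

	/* Release: the data reads above are ordered before the tail update. */
	atomic_store_explicit(&rb->tail, tail + n, memory_order_release);
	return n;
}
```

The producer side is assumed to publish the head with release semantics
after the data is in place, which matches the rule quoted above that the
kernel completes the AUX data before updating the head pointer.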