[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <b578ae21-0f41-65da-7ec8-05f77edb31be@linux.vnet.ibm.com>
Date: Thu, 16 Mar 2017 11:17:48 +0530
From: Madhavan Srinivasan <maddy@...ux.vnet.ibm.com>
To: Michael Ellerman <mpe@...erman.id.au>,
Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Wang Nan <wangnan0@...wei.com>,
Alexei Starovoitov <ast@...nel.org>,
Stephane Eranian <eranian@...gle.com>
Subject: Re: [PATCH v2 1/6] powerpc/perf: Define big-endian version of
perf_mem_data_src
On Wednesday 15 March 2017 11:50 AM, Michael Ellerman wrote:
> Hi Peter,
>
> Peter Zijlstra <peterz@...radead.org> writes:
>> On Tue, Mar 14, 2017 at 02:31:51PM +0530, Madhavan Srinivasan wrote:
>>
>>>> Huh? PPC hasn't yet implemented this? Then why are you fixing it?
>>> yes, PPC hasn't implemented this (until now).
>> until now where?
> On powerpc there is currently no kernel support for filling the data_src
> value with anything meaningful.
>
> A user can still request PERF_SAMPLE_DATA_SRC (perf report -d), but they
> just get the default value from perf_sample_data_init(), which is
> PERF_MEM_NA.
>
> Though even that is currently broken with a big endian perf tool.
>
>>> And did not understand "Then why are you fixing it?"
>> I see no implementation; so why are you poking at it.
> Maddy has posted an implementation of the kernel part for powerpc in
> patch 2 of this series, but maybe you're not on Cc?
Sorry, was out yesterday.
Yes my bad. I CCed lkml and ppcdev and took the emails
from get_maintainer script and added to each file.
I will send out a v3 with peterz and others in all patch.
>
> Regardless of us wanting to do the kernel side on powerpc, the current
> API is broken on big endian.
>
> That's because in the kernel the PERF_MEM_NA value is constructed using
> shifts:
>
> /* TLB access */
> #define PERF_MEM_TLB_NA 0x01 /* not available */
> ...
> #define PERF_MEM_TLB_SHIFT 26
>
> #define PERF_MEM_S(a, s) \
> (((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
>
> #define PERF_MEM_NA (PERF_MEM_S(OP, NA) |\
> PERF_MEM_S(LVL, NA) |\
> PERF_MEM_S(SNOOP, NA) |\
> PERF_MEM_S(LOCK, NA) |\
> PERF_MEM_S(TLB, NA))
>
> Which works out as:
>
> ((0x01 << 0) | (0x01 << 5) | (0x01 << 19) | (0x01 << 24) | (0x01 << 26))
>
>
> Which means the PERF_MEM_NA value comes out of the kernel as 0x5080021
> in CPU endian.
>
> But then in the perf tool, the code uses the bitfields to inspect the
> value, and currently the bitfields are defined using little endian
> ordering.
>
> So eg. in perf_mem__tlb_scnprintf() we see:
> data_src->val = 0x5080021
> op = 0x0
> lvl = 0x0
> snoop = 0x0
> lock = 0x0
> dtlb = 0x0
> rsvd = 0x5080021
>
>
> So this patch does what I think is the minimal fix, of changing the
> definition of the bitfields to match the values that are already
> exported by the kernel on big endian. And it makes no change on little
> endian.
Thanks for the detailed explanation. I will add this to the patch
commit message in the v3.
Maddy
>
> cheers
>
Powered by blists - more mailing lists