Message-ID: <b21bf42e-377d-36d0-49c3-af1e4edf5496@linux.ibm.com>
Date:   Fri, 17 Sep 2021 12:07:50 +0530
From:   Madhavan Srinivasan <maddy@...ux.ibm.com>
To:     Stephane Eranian <eranian@...gle.com>,
        Michael Ellerman <mpe@...erman.id.au>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, acme@...hat.com, jolsa@...hat.com,
        kim.phillips@....com, namhyung@...nel.org, irogers@...gle.com,
        atrajeev@...ux.vnet.ibm.com,
        Arnaldo Carvalho de Melo <acme@...nel.org>
Subject: Re: [PATCH v1 01/13] perf/core: add union to struct perf_branch_entry


On 9/15/21 11:33 AM, Stephane Eranian wrote:
> Michael,
>
>
> On Fri, Sep 10, 2021 at 7:16 AM Michael Ellerman <mpe@...erman.id.au> wrote:
>> Michael Ellerman <mpe@...erman.id.au> writes:
>>> Peter Zijlstra <peterz@...radead.org> writes:
>>>> On Thu, Sep 09, 2021 at 12:56:48AM -0700, Stephane Eranian wrote:
>>>>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>>>>> index f92880a15645..eb11f383f4be 100644
>>>>> --- a/include/uapi/linux/perf_event.h
>>>>> +++ b/include/uapi/linux/perf_event.h
>>>>> @@ -1329,13 +1329,18 @@ union perf_mem_data_src {
>>>>>   struct perf_branch_entry {
>>>>>      __u64   from;
>>>>>      __u64   to;
>>>>> -   __u64   mispred:1,  /* target mispredicted */
>>>>> -           predicted:1,/* target predicted */
>>>>> -           in_tx:1,    /* in transaction */
>>>>> -           abort:1,    /* transaction abort */
>>>>> -           cycles:16,  /* cycle count to last branch */
>>>>> -           type:4,     /* branch type */
>>>>> -           reserved:40;
>>>>> +   union {
>>>>> +           __u64   val;        /* to make it easier to clear all fields */
>>>>> +           struct {
>>>>> +                   __u64   mispred:1,  /* target mispredicted */
>>>>> +                           predicted:1,/* target predicted */
>>>>> +                           in_tx:1,    /* in transaction */
>>>>> +                           abort:1,    /* transaction abort */
>>>>> +                           cycles:16,  /* cycle count to last branch */
>>>>> +                           type:4,     /* branch type */
>>>>> +                           reserved:40;
>>>>> +           };
>>>>> +   };
>>>>>   };
>>>>
>>>> Hurpmh... all other bitfields have ENDIAN_BITFIELD things except this
>>>> one. Power folks, could you please have a look?
>>> The bit number of each field changes between big and little endian, but
>>> as long as kernel and userspace are the same endian, and both only
>>> access values via the bitfields then it works.
>> ...
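
To make the point above concrete, here is a small standalone sketch (not part
of the original thread, and assuming the usual GCC bit-field layout on each
architecture): the first declared field of the flags word occupies bit 0 of
the underlying u64 on a little-endian host but bit 63 on a big-endian one, so
the same declaration decodes different bits depending on where the sample was
written.

#include <stdio.h>
#include <stdint.h>

/* Mirrors the flags layout of struct perf_branch_entry. */
struct branch_flags {
	union {
		uint64_t val;
		struct {
			uint64_t mispred:1,
				 predicted:1,
				 in_tx:1,
				 abort:1,
				 cycles:16,
				 type:4,
				 reserved:40;
		};
	};
};

int main(void)
{
	struct branch_flags f = { 0 };

	f.mispred = 1;
	/* Prints 0x1 on a little-endian host, 0x8000000000000000 on big endian. */
	printf("mispred occupies: %#llx\n", (unsigned long long)f.val);
	return 0;
}
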
>>> It does look like we have a bug in perf tool though, if I take a
>>> perf.data from a big endian system to a little endian one I don't see
>>> any of the branch flags decoded. eg:
>>>
>>> BE:
>>>
>>> 2413132652524 0x1db8 [0x2d0]: PERF_RECORD_SAMPLE(IP, 0x1): 5279/5279: 0xc00000000045c028 period: 923003 addr: 0
>>> ... branch stack: nr:28
>>> .....  0: c00000000045c028 -> c00000000dce7604 0 cycles  P   0
>>>
>>> LE:
>>>
>>> 2413132652524 0x1db8 [0x2d0]: PERF_RECORD_SAMPLE(IP, 0x1): 5279/5279: 0xc00000000045c028 period: 923003 addr: 0
>>> ... branch stack: nr:28
>>> .....  0: c00000000045c028 -> c00000000dce7604 0 cycles      0
>>>                                                           ^
>>>                                                           missing P
>>>
>>> I guess we're missing a byte swap somewhere.
>> Ugh. We _do_ have a byte swap, but we also need a bit swap.
>>
>> That works for the single bit fields, not sure if it will for the
>> multi-bit fields.
>>
>> So that's a bit of a mess :/
>>
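
A rough standalone sketch (again not part of the thread) of why the byte swap
alone leaves the LE report above without the P flag: after bswap_64() the
consumer sees the producer's numeric value, but a word built with the
big-endian field positions has nothing in the bits that the little-endian
declaration reads.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* Flags as a BE producer lays them out: predicted -> bit 62,
	 * cycles:16 -> bits 59..44 (here with the value 7). */
	uint64_t be_val = (1ull << 62) | (7ull << 44);

	/* An LE consumer's bit-field declaration decodes predicted from
	 * bit 1 and cycles from bits 19..4 of the same word. */
	unsigned int predicted = (be_val >> 1) & 0x1;
	unsigned int cycles    = (be_val >> 4) & 0xffff;

	/* Prints "predicted=0 cycles=0": both fields appear empty, hence
	 * the missing "P" in the LE output above. */
	printf("predicted=%u cycles=%u\n", predicted, cycles);
	return 0;
}
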
> Based on what I see in perf_event.h for other structures, I think I
> can make up what you would need for struct branch_entry. But it would
> be easier if you could send me a patch that you would have verified on
> your systems.
> Thanks.
Attached patch fixes the issue. Have tested it in both the BE and LE cases.

Maddy

From f816ba2e6ef8d5975f78442d7ecb50d66c3c4326 Mon Sep 17 00:00:00 2001
From: Madhavan Srinivasan <maddy@...ux.ibm.com>
Date: Wed, 15 Sep 2021 22:29:09 +0530
Subject: [RFC PATCH] tools/perf: Add reverse_64b macro

The branch_stack struct has a bit-field definition
whose bit ordering differs between big and little
endian. Because of this, when a branch_stack sample
collected on a BE system is viewed/reported on an
LE system, the bit fields of the branch stack are
not presented properly. To address this, a
reverse_64b macro is defined and used in
evsel__parse_sample().

Signed-off-by: Madhavan Srinivasan <maddy@...ux.ibm.com>
---
  tools/perf/util/evsel.c | 35 +++++++++++++++++++++++++++++++++--
  1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dbfeceb2546c..3151606e516e 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2221,6 +2221,9 @@ void __weak arch_perf_parse_sample_weight(struct perf_sample *data,
      data->weight = *array;
  }

+#define reverse_64b(src, pos, size)    \
+    (((src >> pos) & ((1ull << size) - 1)) << (63 - (pos + size - 1)))
+
  int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
              struct perf_sample *data)
  {
@@ -2408,6 +2411,8 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
      if (type & PERF_SAMPLE_BRANCH_STACK) {
          const u64 max_branch_nr = UINT64_MAX /
                        sizeof(struct branch_entry);
+        struct branch_entry *e;
+        unsigned i;

          OVERFLOW_CHECK_u64(array);
          data->branch_stack = (struct branch_stack *)array++;
@@ -2416,10 +2421,36 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
              return -EFAULT;

          sz = data->branch_stack->nr * sizeof(struct branch_entry);
-        if (evsel__has_branch_hw_idx(evsel))
+        if (evsel__has_branch_hw_idx(evsel)) {
              sz += sizeof(u64);
-        else
+            e = &data->branch_stack->entries[0];
+        } else {
              data->no_hw_idx = true;
+            e = (struct branch_entry *)&data->branch_stack->hw_idx;
+        }
+
+        if (swapped) {
+            for (i = 0; i < data->branch_stack->nr; i++, e++) {
+                u64 new_val = 0;
+
+                /* mispred:1  target mispredicted */
+                new_val = reverse_64b(e->flags.value, 0, 1);
+                /* predicted:1  target predicted */
+                new_val |= reverse_64b(e->flags.value, 1, 1);
+                /* in_tx:1  in transaction */
+                new_val |= reverse_64b(e->flags.value, 2, 1);
+                /* abort:1  transaction abort */
+                new_val |= reverse_64b(e->flags.value, 3, 1);
+                /* cycles:16  cycle count to last branch */
+                new_val |= reverse_64b(e->flags.value, 4, 16);
+                /* type:4  branch type */
+                new_val |= reverse_64b(e->flags.value, 20, 4);
+                /* reserved:40 */
+                new_val |= reverse_64b(e->flags.value, 24, 40);
+                e->flags.value = new_val;
+            }
+        }
+
          OVERFLOW_CHECK(array, sz, max_size);
          array = (void *)array + sz;
      }
-- 
2.31.1
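
For readers of the archive, a small hypothetical example (not part of the
patch) of what reverse_64b() computes: it extracts the size-bit field starting
at bit pos and mirrors its position within the 64-bit word, so bits
pos..pos+size-1 move to bits 63-(pos+size-1)..63-pos with the field's internal
bit order preserved.

#include <stdio.h>
#include <stdint.h>

/* Same definition as in the patch above. */
#define reverse_64b(src, pos, size) \
	(((src >> pos) & ((1ull << size) - 1)) << (63 - (pos + size - 1)))

int main(void)
{
	/* cycles = 0x1234 placed at its little-endian position, bits 19..4. */
	uint64_t v = 0x1234ull << 4;

	/* Mirrored to bits 59..44, the big-endian position of cycles:16;
	 * prints 0x123400000000000. */
	printf("%#llx\n", (unsigned long long)reverse_64b(v, 4, 16));

	/* A single bit: bit 0 is mirrored to bit 63; prints 0x8000000000000000. */
	printf("%#llx\n", (unsigned long long)reverse_64b(1ull, 0, 1));
	return 0;
}

Called with the little-endian field positions this lands each field where a
big-endian layout keeps it, and called with the big-endian positions it maps
back; that mirroring is what the swap loop above builds on.
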

