Message-ID: <CAP-5=fVqjZOvncE3iTAF6Wfqrn3_UxGsrBJkiaT=qMs5xdq9LA@mail.gmail.com>
Date:   Thu, 2 Jun 2022 09:59:29 -0700
From:   Ian Rogers <irogers@...gle.com>
To:     Leo Yan <leo.yan@...aro.org>
Cc:     Joe Mario <jmario@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>, Alyssa Ross <hi@...ssa.is>,
        Like Xu <likexu@...cent.com>, Kajol Jain <kjain@...ux.ibm.com>,
        Li Huafei <lihuafei1@...wei.com>,
        Adam Li <adam.li@...erecomputing.com>,
        German Gomez <german.gomez@....com>,
        James Clark <james.clark@....com>,
        Ali Saidi <alisaidi@...zon.com>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 00/12] perf c2c: Support display for Arm64

On Wed, Jun 1, 2022 at 3:25 AM Leo Yan <leo.yan@...aro.org> wrote:
>
> Hi Joe,
>
> On Tue, May 31, 2022 at 02:44:07PM -0400, Joe Mario wrote:
>
> [...]
>
> > Hi Leo:
> > I built a new perf with your patches and ran it on a two-NUMA-node Neoverse platform.
> > I then ran my simple test that creates reader and writer threads to tug on the same cache line (sketched below).
> > The c2c output is appended below.
> >
> > The output looks good, especially where you've broken out the (average) cycles for local and remote peer loads.
> > And I'm glad to see you fixed the "Node" column.  I use that a lot to help detect remote node accesses.
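
The kind of workload Joe describes boils down to one thread storing to part of
a cache line while other threads keep loading a different part of the same
line, which is what generates the peer-snoop contention in the c2c output
below.  Here is a minimal, purely illustrative C sketch of such a "tug" test;
Joe's actual tugtest.c is not posted in this thread, so the struct layout,
field names and iteration count are all made up:

  /* Build: gcc -O2 -pthread tug_sketch.c -o tug_sketch
   * (illustrative only; not Joe's actual tugtest.c)
   */
  #include <inttypes.h>
  #include <pthread.h>
  #include <stdatomic.h>
  #include <stdio.h>

  #define ITERS 10000000UL

  /* All four fields deliberately sit in the same 64-byte cache line,
   * so the writer's stores and the readers' loads fight over it. */
  struct tug_line {
          _Atomic uint64_t w1;    /* stored by the writer thread  */
          _Atomic uint64_t w2;
          _Atomic uint64_t r1;    /* loaded by the reader threads */
          _Atomic uint64_t r2;
  } __attribute__((aligned(64)));

  static struct tug_line shared;

  static void *writer(void *arg)
  {
          (void)arg;
          for (unsigned long i = 0; i < ITERS; i++) {
                  /* Stores keep pulling the line into this core ... */
                  atomic_fetch_add(&shared.w1, 1);
                  atomic_fetch_add(&shared.w2, 1);
          }
          return NULL;
  }

  static void *reader(void *arg)
  {
          uint64_t sum = 0;

          (void)arg;
          for (unsigned long i = 0; i < ITERS; i++) {
                  /* ... while loads from other cores trigger the peer
                   * snoops that c2c attributes back to these lines. */
                  sum += atomic_load(&shared.r1);
                  sum += atomic_load(&shared.r2);
          }
          return (void *)(uintptr_t)sum;
  }

  int main(void)
  {
          pthread_t w, r[2];
          int i;

          pthread_create(&w, NULL, writer, NULL);
          for (i = 0; i < 2; i++)
                  pthread_create(&r[i], NULL, reader, NULL);

          pthread_join(w, NULL);
          for (i = 0; i < 2; i++)
                  pthread_join(r[i], NULL);

          printf("writer count: %" PRIu64 "\n", atomic_load(&shared.w1));
          return 0;
  }

Recording it with something like "perf c2c record ./tug_sketch" followed by
"perf c2c report" should attribute the writer's stores and the readers' loads
to different offsets of the same cache line, much like the tugtest rows in the
Pareto table further down.
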
>
> Thanks a lot for your testing and suggestions, which are really helpful!
>
> > And the "PA cnt" field is working as well,  which is important to see if numa_balance is moving the data around.
>
> Good to know.  To be honest, I hadn't paid attention to the "PA cnt"
> metric before.  Having checked the code a bit, I see this metric is very
> useful for understanding how widely a cache line is accessed from
> different addresses, so we can get a sense of how heavily the line is
> being hammered.
>
> > =================================================
> >            Shared Data Cache Line Table
> > =================================================
> > #
> > #        ----------- Cacheline ----------     Peer  ------- Load Peer -------    Total    Total    Total  --------- Stores --------  ----- Core Load Hit -----  - LLC Load Hit --  - RMT Load Hit --  --- Load Dram ----
> > # Index             Address  Node  PA cnt    Snoop    Total    Local   Remote  records    Loads   Stores    L1Hit   L1Miss      N/A       FB       L1       L2    LclHit  LclHitm    RmtHit  RmtHitm       Lcl       Rmt
> > # .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  ........  .......  ........  .......  ........  ........
> > #
> >       0            0x422140     0    6904   74.86%      137      131        6   148008   144970     3038        0        0     3038        0   144833      120        11        0         6        0         0         0
> >       1  0xffffd976e63ae5c0     1       6    3.83%        7        7        0       15       15        0        0        0        0        0        8        4         3        0         0        0         0         0
> >       2  0xffff07ffbf290980     0       5    2.19%        4        2        2       14       14        0        0        0        0        0       10        1         1        0         2        0         0         0
> >       3  0xffffd976e57275c0     1       1    0.55%        1        1        0        1        1        0        0        0        0        0        0        1         0        0         0        0         0         0
> >       4  0xffffd976e6071c00     1       3    0.55%        1        0        1        4        4        0        0        0        0        0        3        0         0        0         1        0         0         0
> >      [snip]
> > =================================================
> >       Shared Cache Line Distribution Pareto
> > =================================================
> > #
> > #        -- Peer Snoop --  ------- Store Refs ------  --------- Data address ---------                      ---------- cycles ----------    Total       cpu                               Shared
> > #   Num      Rmt      Lcl   L1 Hit  L1 Miss      N/A              Offset  Node  PA cnt        Code address  rmt peer  lcl peer      load  records       cnt                      Symbol   Object                Source:Line  Node
> > # .....  .......  .......  .......  .......  .......  ..................  ....  ......  ..................  ........  ........  ........  .......  ........  ..........................  .......  .........................  ....
> > #
> >   ----------------------------------------------------------------------
> >       0        6      131        0        0     3038            0x422140
> >   ----------------------------------------------------------------------
> >            0.00%    0.00%    0.00%    0.00%   52.60%                 0x8     0       1            0x400e6c         0         0         0     1598         4  [.] writer                  tugtest  tugtest.c:152               0 1
> >            0.00%    0.00%    0.00%    0.00%   47.40%                0x10     0       1            0x400e7c         0         0         0     1440         4  [.] writer                  tugtest  tugtest.c:153               0 1
> >           33.33%   75.57%    0.00%    0.00%    0.00%                0x20     0       1            0x401018      4095      3803      3419      409         4  [.] reader                  tugtest  tugtest.c:187               0 1
> >           66.67%   24.43%    0.00%    0.00%    0.00%                0x28     0       1            0x401034      4095      3470      3643      413         4  [.] reader                  tugtest  tugtest.c:187               0 1
> >
> >   ----------------------------------------------------------------------
> >       1        0        7        0        0        0  0xffffd976e63ae5c0
> >   ----------------------------------------------------------------------
> >            0.00%   57.14%    0.00%    0.00%    0.00%                 0x0     1       1  0xffffd976e4815fbc         0      1333         0        4         2  [k] ktime_get                   [kernel.kallsyms]  seqlock.h:276          1
> >            0.00%   14.29%    0.00%    0.00%    0.00%                 0x0     1       1  0xffffd976e4816d10         0       266       794        4         3  [k] ktime_get_update_offsets_n  [kernel.kallsyms]  seqlock.h:276        0 1
> >            0.00%   28.57%    0.00%    0.00%    0.00%                0x30     1       1  0xffffd976e4816d20         0        87       150        4         3  [k] ktime_get_update_offsets_n  [kernel.kallsyms]  timekeeping.c:2298   0 1
> >
> >   ----------------------------------------------------------------------
> >       2        2        2        0        0        0  0xffff07ffbf290980
> >   ----------------------------------------------------------------------
> >           50.00%  100.00%    0.00%    0.00%    0.00%                 0x4     0       1  0xffffd976e47d2bdc      1217      1600      1147        4         3  [k] queued_spin_lock_slowpath  [kernel.kallsyms]  qspinlock.c:511    0 1
> >           50.00%    0.00%    0.00%    0.00%    0.00%                 0x4     0       1  0xffffd976e47d2a2c      4033         0         0        1         1  [k] queued_spin_lock_slowpath  [kernel.kallsyms]  qspinlock.c:382    0 1
> >
> >   ----------------------------------------------------------------------
> >
> > Thanks for doing this.  It looks good.
>
> You are welcome!  And I really appreciate your help in maturing the code.
>
> > I'll assume someone else is reviewing your code changes.
>
> Yeah, let's allow a bit more time for review.
>
> Thanks,
> Leo

This is great, Leo! I've not been able to test the changes, but I don't
have any coding comments (happy to give an Acked-by). Do you think we can
add a test for this? The test could skip when c2c isn't supported, along
the lines of the sketch below.
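
Purely as a sketch of that skip-when-unsupported idea (standalone C here,
not wired into perf's test framework, which has its own conventions),
something like the following could probe whether "perf c2c record" works at
all and bail out with a skip status otherwise.  The temporary file name, the
exact record flags and the use of exit code 2 for "skip" (mirroring perf's
shell-test convention) are assumptions on my part:

  /* Standalone illustration only; a real test would live under
   * tools/perf/tests/ and follow that framework's conventions. */
  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
          /* Try a trivial c2c record run; if the required events are
           * not supported on this machine, report a skip, not a fail. */
          int ret = system("perf c2c record -o /tmp/c2c-test.data -- true "
                           ">/dev/null 2>&1");

          if (ret != 0) {
                  printf("SKIP: perf c2c record not supported here\n");
                  return 2;   /* 2 == "skip" in the shell-test convention */
          }

          /* Record worked; a real test would now parse the report and
           * check for the expected columns (e.g. the new peer fields). */
          ret = system("perf c2c report -i /tmp/c2c-test.data --stdio "
                       ">/dev/null 2>&1");

          printf("%s\n", ret == 0 ? "PASS" : "FAIL");
          return ret == 0 ? 0 : 1;
  }

A real test would of course also check the report output for the new
peer-load columns rather than just the exit status.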

Thanks,
Ian
