Date:   Thu, 19 May 2022 17:06:18 +0800
From:   Adam Li <adamli@...amperecomputing.com>
To:     Leo Yan <leo.yan@...aro.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Like Xu <likexu@...cent.com>, Ian Rogers <irogers@...gle.com>,
        Alyssa Ross <hi@...ssa.is>, Kajol Jain <kjain@...ux.ibm.com>,
        Li Huafei <lihuafei1@...wei.com>,
        German Gomez <german.gomez@....com>,
        James Clark <james.clark@....com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        Ali Saidi <alisaidi@...zon.com>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 9/11] perf c2c: Sort on peer snooping for load
 operations

Hi Leo,

Thanks for the update.
On 5/18/2022 2:12 PM, Leo Yan wrote:
 
> Please note, in the total statistics, all remote accesses will be
> accounted into metric "rmt_hit", so "rmt_hit" includes the access for
> remote DRAM or any upwards cache levels due we cannot distinguish
> them.
>

Agree that "Load Remote HIT" makes more sense than "Load Remote DRAM".
 
> From my experiment, with this updating the output result is promised
> for the peer accesses and it's easier for inspecting false sharing.
> 
> As you might see I have prepared a git repo:
> https://git.linaro.org/people/leo.yan/linux-spe.git/ branch:
> perf_c2c_arm_spe_peer_v3, which contains the updated patches for both
> memory flag setting and perf c2c related patches.
> 
> Could you confirm if the updated code works for you or not?
> 

I tested the v3 patches (perf_c2c_arm_spe_peer_v3 branch) on a 2P Altra system.

Compared with v2, "Snoop Peer" indicates cache false sharing better
for the 'false_sharing.exe' test case.

Below are the details:

# perf c2c record -- numactl -m 0 ./false_sharing.exe 2
183 mticks, reader_thd (thread 2), on node 0 (cpu 78).
195 mticks, reader_thd (thread 3), on node 1 (cpu 124).
546 mticks, lock_th (thread 0), on node 0 (cpu 0).
562 mticks, lock_th (thread 1), on node 1 (cpu 123).
[ perf record: Woken up 36 times to write data ]
[ perf record: Captured and wrote 72.440 MB perf.data ]

# perf c2c report -d peer --coalesce tid,pid,iaddr,dso -N --stdio
Warning:
Arm SPE CONTEXT packets not found in the traces.
Matching of TIDs to SPE events could be inaccurate.
Warning:
AUX data detected collision  20 times out of 168!

=================================================
  Total records                     :    1198728
  Locked Load/Store Operations      :          0
  Load Operations                   :    1031196
  Loads - uncacheable               :          0
  Loads - IO                        :          0
  Loads - Miss                      :          0
  Loads - no mapping                :          0
  Load Fill Buffer Hit              :          0
  Load L1D hit                      :     970636
  Load L2D hit                      :        292
  Load LLC hit                      :       2477
  Load Local HITM                   :          0
  Load Remote HITM                  :          0
  Load Remote HIT                   :      56459
  Load Local DRAM                   :       1332
  Load Remote DRAM                  :          0
  Load MESI State Exclusive         :       1332
  Load MESI State Shared            :          0
  Load LLC Misses                   :      57791
  Load access blocked by data       :          0
  Load access blocked by address    :          0
  Load HIT Peer                     :      58814
  LLC Misses to Local DRAM          :        2.3%
  LLC Misses to Remote DRAM         :        0.0%
  LLC Misses to Remote cache (HIT)  :       97.7%
  LLC Misses to Remote cache (HITM) :        0.0%
  Store Operations                  :     167532
  Store - uncacheable               :          0
  Store - no mapping                :          0
  Store L1D Hit                     :          0
  Store L1D Miss                    :          0
  Store No available memory level   :     167532
  No Page Map Rejects               :       1234
  Unable to parse data source       :          0

=================================================
    Global Shared Cache Line Event Information
=================================================
  Total Shared Cache Lines          :         45
  Load HITs on shared lines         :     226254
  Fill Buffer Hits on shared lines  :          0
  L1D hits on shared lines          :     166010
  L2D hits on shared lines          :          4
  Load HITs on peer cache lines     :      58814
  LLC hits on shared lines          :       2455
  Locked Access on shared lines     :          0
  Blocked Access on shared lines    :          0
  Store HITs on shared lines        :      96403
  Store L1D hits on shared lines    :          0
  Store No available memory level   :      96403
  Total Merged records              :      96403

=================================================
                 c2c details
=================================================
  Events                            : arm_spe_0/ts_enable=1,load_filter=1,store_filter=1,min_latency=30/
                                    : dummy:u
                                    : memory
  Cachelines sort on                : Snoop Peers
  Cacheline data grouping           : offset,tid,pid,iaddr,dso

=================================================
           Shared Data Cache Line Table
=================================================
#
#        ----------- Cacheline ----------    Snoop  ------- Load Hitm -------    Snoop    Total    Total    Total  --------- Stores --------  ----- Core Load Hit -----  - LLC Load Hit --  - RMT Load Hit --  --- Load Dram ----
# Index             Address  Node  PA cnt     Peer    Total  LclHitm  RmtHitm     Peer  records    Loads   Stores    L1Hit   L1Miss      N/A       FB       L1       L2    LclHit  LclHitm    RmtHit  RmtHitm       Lcl       Rmt
# .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  ........  .......  ........  .......  ........  ........
#
      0            0x420180   N/A       0   95.53%        0        0        0    56183   246056   219522    26534        0        0    26534        0   161914        0       106        0     56176        0      1326         0
      1            0x420100   N/A       0    4.37%        0        0        0     2571    76437     6576    69861        0        0    69861        0     4005        0      2335        0       236        0         0         0
[...]
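
As a sanity check on the numbers above, here is a small Python sketch (not part of perf; the variable and function names are only illustrative) that recomputes the percentages from the raw counters in the report. It assumes that "Load LLC Misses" is the sum of "Load Remote HIT" and "Load Local DRAM", and that the "Snoop Peer" percentage of a cache line is its peer count divided by the global "Load HIT Peer" count. Both assumptions are inferred from the output, not taken from the perf source.

```python
# Counters copied from the "perf c2c report" output above.
load_remote_hit = 56459
load_local_dram = 1332
load_llc_misses = 57791
load_hit_peer_total = 58814

# Per-cacheline "Snoop Peer" counts for index 0 and index 1.
peer_per_line = {"0x420180": 56183, "0x420100": 2571}


def pct(part, total, digits):
    """Return part/total as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / total, digits) if total else 0.0


# Assumption: LLC misses = remote hits + local DRAM accesses.
llc_miss_sum = load_remote_hit + load_local_dram

to_local_dram = pct(load_local_dram, load_llc_misses, 1)
to_remote_hit = pct(load_remote_hit, load_llc_misses, 1)

# Assumption: per-line share = line's peer count / global peer count.
peer_share = {addr: pct(n, load_hit_peer_total, 2)
              for addr, n in peer_per_line.items()}

print(llc_miss_sum, to_local_dram, to_remote_hit, peer_share)
```

Under these assumptions the recomputed values agree with the 2.3% / 97.7% LLC miss split and the 95.53% / 4.37% "Snoop Peer" column reported above.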

Thanks,
-adam
