lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 13 May 2022 17:05:45 +0800
From:   Adam Li <adamli@...eremail.onmicrosoft.com>
To:     Leo Yan <leo.yan@...aro.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Like Xu <likexu@...cent.com>, Ian Rogers <irogers@...gle.com>,
        Alyssa Ross <hi@...ssa.is>, Kajol Jain <kjain@...ux.ibm.com>,
        Li Huafei <lihuafei1@...wei.com>,
        German Gomez <german.gomez@....com>,
        James Clark <james.clark@....com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        Ali Saidi <alisaidi@...zon.com>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 9/11] perf c2c: Sort on peer snooping for load
 operations

On 5/8/2022 5:23 PM, Leo Yan wrote:
> Except the existed three display options 'tot', 'rmt', 'lcl', this patch
> adds a new option 'peer' so can sort on the cache hit for peer snooping.
> 
> For displaying with option 'peer', the "Shared Data Cache Line Table" and
> "Shared Cache Line Distribution Pareto" both sort with the metrics
> "ld_peer".  As result, we can get the 'peer' display as below:
> 
>   # perf c2c report -d peer --coalesce tid,pid,iaddr,dso -N --stdio
> 

Hi Leo,

I tested v2 patch on 2P Altra system.
In case the false-sharing data is mainly from remote node, 'Snoop Peers'
cannot indicate severity of false-sharing. As showed in bellow output,
there are only 10 'Load HIT Peer' records, while there are 2353
'Load Remote DRAM' records.

And the name 'Load Remote DRAM' is kind of misleading, since we cannot tell
the data source is 'DRAM'.

Run false_sharing test(https://github.com/joemario/perf-c2c-usage-files):
one lock_th on node 0, one reader_thd on node 1:

# perf c2c record -- numactl -m 0 ./false_sharing.exe 1
131 mticks, reader_thd (thread 1), on node 1 (cpu 80).
145 mticks, lock_th (thread 0), on node 0 (cpu 9).
[ perf record: Woken up 16 times to write data ]
[ perf record: Captured and wrote 33.726 MB perf.data ]


# perf c2c report -d peer --coalesce tid,pid,iaddr,dso -N --stdio
Warning:
Arm SPE CONTEXT packets not found in the traces.
Matching of TIDs to SPE events could be inaccurate.
Warning:
AUX data detected collision  6 times out of 47!

=================================================
            Trace Event Information
=================================================
  Total records                     :     551944
  Locked Load/Store Operations      :          0
  Load Operations                   :     493082
  Loads - uncacheable               :          0
  Loads - IO                        :          0
  Loads - Miss                      :          0
  Loads - no mapping                :          0
  Load Fill Buffer Hit              :          0
  Load L1D hit                      :     490589
  Load L2D hit                      :        117
  Load LLC hit                      :         11
  Load HIT Peer                     :         10
  Load Local HITM                   :          0
  Load Remote HITM                  :          0
  Load Remote HIT                   :          0
  Load Local DRAM                   :          2
  Load Remote DRAM                  :       2353
  Load MESI State Exclusive         :       2355
  Load MESI State Shared            :          0
  Load LLC Misses                   :       2355
  Load access blocked by data       :          0
  Load access blocked by address    :          0
  LLC Misses to Local DRAM          :        0.1%
  LLC Misses to Remote DRAM         :       99.9%
  LLC Misses to Remote cache (HIT)  :        0.0%
  LLC Misses to Remote cache (HITM) :        0.0%
  Store Operations                  :      58862
  Store - uncacheable               :          0
  Store - no mapping                :          0
  Store L1D Hit                     :          0
  Store L1D Miss                    :          0
  Store No available memory level   :      58862
  No Page Map Rejects               :        490
  Unable to parse data source       :          0

=================================================
    Global Shared Cache Line Event Information
=================================================
  Total Shared Cache Lines          :          9
  Load HITs on shared lines         :         21
  Fill Buffer Hits on shared lines  :          0
  L1D hits on shared lines          :          6
  L2D hits on shared lines          :          1
  Load HITs on peer cache lines     :         10
  LLC hits on shared lines          :          0
  Locked Access on shared lines     :          0
  Blocked Access on shared lines    :          0
  Store HITs on shared lines        :          0
  Store L1D hits on shared lines    :          0
  Store No available memory level   :          0
  Total Merged records              :          0

=================================================
                 c2c details
=================================================
  Events                            : arm_spe_0/ts_enable=1,load_filter=1,store_filter=1,min_latency=30/
                                    : dummy:u
                                    : memory
  Cachelines sort on                : Snoop Peers
  Cacheline data grouping           : offset,tid,pid,iaddr,dso

[...]

Thanks,
-adam

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ