Message-ID: <38dee054-19ce-a545-fb62-dea4d0036c94@arm.com>
Date:   Fri, 19 Apr 2019 22:18:02 +0100
From:   Robin Murphy <robin.murphy@....com>
To:     Willy Wolff <willy.mh.wolff.ml@...il.com>,
        Rob Herring <robh+dt@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Kukjin Kim <kgene@...nel.org>,
        Krzysztof Kozlowski <krzk@...nel.org>,
        devicetree@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-samsung-soc@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] ARM: dts: exynos: add CCI-400 PMU nodes support to
 Exynos542x SoCs

On 2019-04-19 6:53 pm, Willy Wolff wrote:
> Hi,
> 
> This patch can be dropped, as it needs more work.
> 
> In fact, the interrupts seem to be wrong. The interrupts suggested by
> Anand Moon gave the same results, shown below.
> 
> export CCI_DEV=CCI_400
> export OMP_NUM_THREADS=2
> sudo --preserve-env ./perf stat -a \
>    -e armv7_cortex_a7/config=0x11,name=a7_cycles/ \
>    -e armv7_cortex_a15/config=0x11,name=a15_cycles/ \
>    -e armv7_cortex_a7/config=0x19,name=a7_bus/ \
>    -e armv7_cortex_a15/config=0x19,name=a15_bus/ \
>    -e ${CCI_DEV}/config=0xff,name=cci400_cycles/ \
>    -e ${CCI_DEV}/config=0x0,name=cci400_si_rrq_hs_any/ \
>    -e ${CCI_DEV}/config=0xc,name=cci400_si_wrq_hs_any/ \

From the look of those configs, you'll be counting events on slave
interface 0, which may not even have anything connected anyway. The CPU
clusters on a CCI-400 will be on slave interfaces 3 and 4, so try
something like '-e CCI_400/cci400_si_rrq_hs_any,source=4/'.
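
To make the source selection concrete, here is a minimal sketch of how a
raw config value could be composed. It assumes the event code sits in
config bits 0-4 and the source interface in bits 5-7; that layout is not
stated in this thread, so check the files under
/sys/bus/event_source/devices/CCI_400/format/ on the target before
relying on it. The helper names are hypothetical.

```python
# Assumed layout (verify against the sysfs format files on the target):
#   config[4:0] = event code, config[7:5] = source interface.
EVENT_BITS = 5
EVENT_MASK = (1 << EVENT_BITS) - 1  # 0x1f
SOURCE_MASK = 0x7


def cci400_config(source, event):
    """Compose a raw config value from a source interface and event code."""
    if not 0 <= source <= SOURCE_MASK:
        raise ValueError("source out of range for the assumed 3-bit field")
    if not 0 <= event <= EVENT_MASK:
        raise ValueError("event out of range for the assumed 5-bit field")
    return (source << EVENT_BITS) | event


def cci400_decode(config):
    """Split a raw config value into (source, event) under the same layout."""
    return (config >> EVENT_BITS) & SOURCE_MASK, config & EVENT_MASK


# The configs from the quoted command both decode to source 0.
print(cci400_decode(0x0))   # (0, 0)
print(cci400_decode(0xc))   # (0, 12)

# The same two events on slave interfaces 4 and 3 respectively.
print(f"{cci400_config(4, 0x0):#x}")  # 0x80
print(f"{cci400_config(3, 0xc):#x}")  # 0x6c
```

Under that assumption, config=0x80 would be the raw equivalent of event
0x0 with source=4. The cycle counter (config=0xff in the quoted command)
is a separate value and is not meant to be decoded this way.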

The interrupts only matter for counter overflow. To confirm them, pick a
sufficiently frequent event, count for long enough to capture slightly
more than 2^32 occurrences, then see whether the overflow accumulates
correctly or the count appears to go backwards (and/or check what fired
in /proc/interrupts). I believe the cycle counter is also 32-bit on CCI,
so that should be relatively easy. For the other counters beyond the
first one, it should be feasible to schedule additional dummy events
before the event of interest in order to trick pmu_get_event_idx() into
allocating the desired counter for it.
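
As a rough guide to "long enough", the sketch below estimates the time
for a 32-bit counter to wrap from the cci400_cycles figure in the quoted
output, and shows the modulo-2^32 arithmetic that makes a wrapped
reading come out right. It is an illustration only, not the driver's
code; the function names are hypothetical, and the 32-bit width is the
belief stated above rather than something verified here.

```python
COUNTER_BITS = 32  # assumption: CCI counters (including cycles) are 32-bit


def seconds_to_wrap(count, elapsed_s, bits=COUNTER_BITS):
    """Estimate how long a counter takes to wrap at the observed rate."""
    rate = count / elapsed_s  # events per second
    return (1 << bits) / rate


def delta_32(prev_raw, curr_raw):
    """Events between two raw 32-bit reads, tolerating a single wrap."""
    return (curr_raw - prev_raw) & 0xFFFFFFFF


# Figures from the quoted run: 3,789,936,935 cycles in 9.541340558 s.
wrap_s = seconds_to_wrap(3_789_936_935, 9.541340558)
print(f"cycle counter wraps after about {wrap_s:.1f} s")

# A raw value that wrapped from near the top back to a small number.
print(delta_32(0xFFFFFFF0, 0x10))  # 32
```

At the rate seen in the quoted run the cycle counter would wrap after
just under 11 seconds, so the 9.5 second run stays below 2^32 and does
not exercise the overflow interrupt. A run of 15 seconds or more should.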

Robin.

>    taskset -c 0,7 /home/user/cg.x.A 1
> 
> [..]
> 
>   Performance counter stats for 'system wide':
> 
>       9,362,850,550      a7_cycles
>       1,682,125,760      a15_cycles
>          68,920,347      a7_bus
>          61,484,352      a15_bus
>       3,789,936,935      cci400_cycles
>                   0      cci400_si_rrq_hs_any
>                   0      cci400_si_wrq_hs_any
> 
>         9.541340558 seconds time elapsed
> 
> cg.x.A comes from the NAS benchmark suite, compiled with -fopenmp, set up
> to run 2 threads and pinned with taskset to run on both the a7 and a15
> clusters.
> a7_bus and a15_bus report main memory accesses.
> 
> Only cci400_cycles seems to be correct. However, all pmcs from the master
> interface are reported as unsupported and all pmcs from the slave interface
> return 0, which is probably not correct.
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0470f/CJHICFBF.html
> 
> Could someone from Samsung provide the correct interrupt values?
> Many thanks.
> 
> Regards,
> Willy