Message-ID: <1892ab83960fbdcdbb49c141577f2c11@www.loen.fr>
Date: Wed, 04 Dec 2019 11:38:15 +0000
From: Marc Zyngier <maz@...nel.org>
To: Robin Murphy <robin.murphy@....com>
Cc: Andreas Färber <afaerber@...e.de>,
Wang YanQing <udknight@...il.com>,
Mark Rutland <mark.rutland@....com>,
<linux-realtek-soc@...ts.infradead.org>,
Will Deacon <will.deacon@....com>,
<linux-kernel@...r.kernel.org>, <linux-soc@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>
Subject: Re: perf record doesn't work on rtd129x SoC
On 2019-12-04 11:20, Robin Murphy wrote:
> On 2019-12-04 7:28 am, Andreas Färber wrote:
>> Hi YanQing,
>> + LAKML + Mark + Will
>> On 04.12.19 at 05:55, Wang YanQing wrote:
>>> I use "perf record" to debug a performance issue on the RTD1296 SoC; it
>>> doesn't work, but "perf stat" is OK!
>> Thanks for the report - which board, branch and (base) tag are you
>> testing against? And are you building perf yourself from kernel sources,
>> or are you using some distro package?
>> I only have Busybox in my initrd on DS418; I have not tested perf.
>>
>>> After some digging in the kernel, I found the reason is that there is no
>>> PMU overflow interrupt. I think the PMU configuration below isn't right
>>> for RTD1296:
>>> "
>>> arm_pmu: arm-pmu {
>>>         compatible = "arm,cortex-a53-pmu";
>>>         interrupts = <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>;
>>> };
>>> "
>>>
>>> We need 4 PMU SPIs for RTD1296 (4 cores), and I guess the 48 isn't
>>> right either.
>> Note that the above rtd129x.dtsi snippet is not complete. See
>> rtd1296.dtsi:
>> &arm_pmu {
>>         interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>;
>> };
>
> That doesn't help much, since 4 affinities for one SPI is rather
> nonsensical.
>
>> 48 and high/4 match what I see in the latest BSP:
>>
>> https://github.com/BPI-SINOVOIP/BPI-M4-bsp/blob/master/linux-rtk/arch/arm64/boot/dts/realtek/rtd129x/rtd-1296.dtsi#L116
>>
>>> Any suggestion is welcome.
>>>
>>> Thanks!
>> The only difference I see is "arm,cortex-a53-pmu" vs.
>> "arm,armv8-pmuv3".
>> By my reading of arch/arm64/kernel/perf_event.c the only difference
>> between the two should be the name and an extra cache_map. You could try
>> the other compatible string in your .dts, but I doubt it'll help.
>> Hopefully the Realtek or Arm guys can shed some light.
>
> If the SoC really has all 4 overflow interrupts combined into a
> single SPI line, then sampling just isn't going to be supported - it's
> unreasonably difficult to handle overflow when the IRQ may be taken on
> the wrong CPU.
Indeed. And I've recently found this exact design blunder on a brand new
Amlogic SoC, where the per-core interrupts have been OR'd together.
And not just for the PMU! It is the same situation for the GIC, CTI,
and a couple of other things. The only sane interrupts are the timers.
(sound of a PCB hitting the bin...)
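
For contrast, here is a minimal sketch of what a usable description could
look like, assuming the hardware routed one PMU overflow SPI per core. The
SPI numbers 48-51 are hypothetical placeholders, not known RTD1296 values;
only the single SPI 48 appears in the dtsi quoted above.

```dts
/*
 * Hypothetical sketch only: the four SPI numbers are placeholders.
 * It assumes the SoC exposes a separate overflow interrupt per core.
 */
arm_pmu: arm-pmu {
	compatible = "arm,cortex-a53-pmu";
	/* One overflow interrupt per core ... */
	interrupts = <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>,
		     <GIC_SPI 49 IRQ_TYPE_LEVEL_HIGH>,
		     <GIC_SPI 50 IRQ_TYPE_LEVEL_HIGH>,
		     <GIC_SPI 51 IRQ_TYPE_LEVEL_HIGH>;
	/* ... each entry paired, in order, with the CPU it belongs to. */
	interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>;
};
```

With a one-to-one pairing like this, each overflow IRQ can be affine to the
CPU whose counters raised it, which is what sampling needs.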
M.
--
Jazz is not dead. It just smells funny...