Message-ID: <d02e0b72-5fb3-dd47-468c-08b86db07a9a@arm.com>
Date: Tue, 14 Apr 2020 13:28:05 +0100
From: Robin Murphy <robin.murphy@....com>
To: Soeren Moch <smoch@....de>, Shawn Lin <shawn.lin@...k-chips.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
Andrew Murray <amurray@...goodpenguin.co.uk>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Heiko Stuebner <heiko@...ech.de>,
linux-rockchip@...ts.infradead.org, linux-pci@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [BUG] PCI: rockchip: rk3399: pcie switch support

On 2020-04-14 12:35 pm, Soeren Moch wrote:
> On 06.04.20 19:12, Soeren Moch wrote:
>> On 06.04.20 14:52, Robin Murphy wrote:
>>> On 2020-04-04 7:41 pm, Soeren Moch wrote:
>>>> I want to use a PCIe switch on a RK3399 based RockPro64 V2.1 board.
>>>> "Normal" PCIe cards work (mostly) just fine on this board. The PCIe
>>>> switches (I tried Pericom and ASMedia based switches) also work fine on
>>>> other boards. The RK3399 PCIe controller with pcie_rockchip_host driver
>>>> also recognises the switch, but fails to initialize the buses behind the
>>>> bridge properly, see syslog from linux-5.6.0.
>>>>
>>>> Any ideas what I do wrong, or any suggestions what I can test here?
>>> See the thread here:
>>>
>>> https://lore.kernel.org/linux-pci/CAMdYzYoTwjKz4EN8PtD5pZfu3+SX+68JL+dfvmCrSnLL=K6Few@mail.gmail.com/
>>>
>> Thanks Robin!
>>
>> I also found out in the meantime that device enumeration fails in this
>> fatal way when probing non-existent devices. So if I hack my complete
>> bus topology into rockchip_pcie_valid_device, then all existing devices
>> come up properly. Of course this is not how PCIe should work.
>>> The conclusion there seems to be that the RK3399 root complex just
>>> doesn't handle certain types of response in a sensible manner, and
>>> there's not much that can reasonably be done to change that.
>> Hm, at least there is the promising suggestion to take over the SError
>> handler, maybe in ATF, as workaround.
> Unfortunately it seems to be not that easy. Only when PCIe device
> probing runs on one of the Cortex-A72 cores of rk3399 we see the SError.
> When probing runs on one of the A53 cores, we get a synchronous external
> abort instead.
>
> Is this expected to see different error types on big.LITTLE systems? Or
> is this another special property of the rk3399 pcie controller?

As far as I'm aware, the CPU microarchitecture is indeed one of the
factors in whether it takes a given external abort synchronously or
asynchronously, so yes, I'd say that probably is expected. I wouldn't
necessarily even rely on a single microarchitecture only behaving one
way, since in principle it's possible that surrounding instructions
might affect whether the core still has enough context left to take the
exception synchronously or not at the point the abort does come back.

In general external aborts are a "should never happen" kind of thing, so
they're not necessarily expected to be recoverable (I think the RAS
extensions might add more robustness in terms of reporting, but they
aren't relevant here either way).
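
To make the synchronous/asynchronous distinction concrete, here is a
minimal sketch of how the two cases differ in the ESR_ELx syndrome value
that the exception handler sees. The field positions and encodings
(EC in bits [31:26], DFSC in bits [5:0], EC 0x24/0x25 for data aborts,
EC 0x2F for SError, DFSC 0x10 for a synchronous external abort) follow
the Armv8-A architecture. The macro names, the enum and classify_esr()
are hypothetical names made up for this illustration, not kernel code.
For brevity it ignores instruction aborts and external aborts on
translation table walks.

```c
#include <stdint.h>

/* ESR_ELx field layout per the Armv8-A architecture.
 * All identifiers below are hypothetical, chosen for this sketch. */
#define ESR_EC_SHIFT          26
#define ESR_EC_MASK           0x3Fu
#define ESR_EC_DABT_LOWER     0x24u  /* data abort taken from a lower EL */
#define ESR_EC_DABT_CURRENT   0x25u  /* data abort taken from the same EL */
#define ESR_EC_SERROR         0x2Fu  /* SError interrupt (asynchronous) */
#define ESR_DFSC_MASK         0x3Fu
#define ESR_DFSC_SYNC_EXTABT  0x10u  /* synchronous external abort */

enum abort_kind {
	ABORT_OTHER,          /* anything this sketch does not classify */
	ABORT_SYNC_EXTERNAL,  /* reported precisely, at the faulting access */
	ABORT_SERROR          /* reported later, as an SError interrupt */
};

/* Classify a raw ESR_ELx value into one of the kinds above. */
enum abort_kind classify_esr(uint32_t esr)
{
	uint32_t ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
	uint32_t dfsc = esr & ESR_DFSC_MASK;

	if (ec == ESR_EC_SERROR)
		return ABORT_SERROR;

	if ((ec == ESR_EC_DABT_LOWER || ec == ESR_EC_DABT_CURRENT) &&
	    dfsc == ESR_DFSC_SYNC_EXTABT)
		return ABORT_SYNC_EXTERNAL;

	return ABORT_OTHER;
}
```

With these encodings, a syndrome of 0x96000010 classifies as a
synchronous external abort and 0xbf000002 as an SError, so the same
failed config access can land in two quite different handlers depending
on which core issued it.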

At this point I'm starting to wonder whether it might be possible to do
something similar to the Arm N1SDP workaround using the Cortex-M0,
albeit with the complication that probing would realistically have to be
explicitly invoked from the Linux driver due to clocks and external
regulators... :/

Robin.