lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2401016.9fHWaBTJ5E@diego>
Date: Mon, 29 Jul 2024 21:09:28 +0200
From: Heiko Stübner <heiko@...ech.de>
To: Sam Edwards <cfsworks@...il.com>, Rob Herring <robh+dt@...nel.org>,
 Jonathan Bennett <jbennett@...omsystems.biz>
Cc: linux-rockchip@...ts.infradead.org, linux-arm-kernel@...ts.infradead.org,
 linux-kernel@...r.kernel.org, devicetree@...r.kernel.org,
 Daniel Kukieła <daniel@...iela.pl>,
 Sven Rademakers <sven.rademakers@...il.com>, Joshua Riek <jjriek@...izon.net>
Subject: Re: [PATCH v2] arm64: dts: rockchip: Add PCIe pinctrls to Turing RK1

Hi Jonathan, Sam,

Am Mittwoch, 5. Juni 2024, 21:45:42 CEST schrieb Jonathan Bennett:
> On 12/8/23 11:27 AM, Sam Edwards wrote:
> > On 12/8/23 04:05, Heiko Stübner wrote:
> >> Am Freitag, 8. Dezember 2023, 07:25:10 CET schrieb Sam Edwards:
> >>> The RK3588 PCIe 3.0 controller seems to have unpredictable behavior 
> >>> when
> >>> no CLKREQ/PERST/WAKE pins are configured in the pinmux. In 
> >>> particular, it
> >>> will sometimes (varying between specific RK3588 chips, not over 
> >>> time) shut
> >>> off the DBI block, and reads to this range will instead stall
> >>> indefinitely.
> >>>
> >>> When this happens, it will prevent Linux from booting altogether. The
> >>> PCIe driver will stall the CPU core once it attempts to read the 
> >>> version
> >>> information from the DBI range.
> >>>
> >>> Fix this boot hang by adding the correct pinctrl configuration to the
> >>> PCIe 3.0 device node, which is the proper thing to do anyway. While
> >>> we're at it, also add the necessary configuration to the PCIe 2.0 node,
> >>> which may or may not fix the equivalent problem over there -- but is 
> >>> the
> >>> proper thing to do anyway. :)
> >>>
> >>> Fixes: 2806a69f3fef6 ("arm64: dts: rockchip: Add Turing RK1 SoM 
> >>> support")
> >>> Signed-off-by: Sam Edwards <CFSworks@...il.com>
> >>> ---
> >>>
> >>> Hi list,
> >>>
> >>> Compared to v1, v2 removes the `reset-gpios` properties as well -- 
> >>> this should
> >>> give control of the PCIe resets exclusively to the PCIe cores. (And 
> >>> even if the
> >>> `reset-gpios` props had no effect in v1, it'd be confusing to have 
> >>> them there.)
> >>
> >> Hmm, I'd think this could result in differing behaviour.
> >>
> >> I.e. I tried the same on a different board with a nvme drive on the 
> >> pci30x4
> >> controller. But moving the reset from the gpio-way to "just" setting the
> >> perstn pinctrl, simply hung the controller when probing the device.
> >
> > Ah, I'm guessing it died in:
> > ver = dw_pcie_readl_dbi(pci, PCIE_VERSION_NUMBER);
> >
> > If so, that's the same hang that this patch is aiming to solve. I'm 
> > starting to wonder if having muxed pins != 1 for a given signal upsets 
> > the RC(s). Is your board (in an earlier boot stage: 
> > reset/maskrom/bootloader) muxing a different perstn pin option to that 
> > controller (and Linux doesn't know to clear it away)? Then when you 
> > add the perstn pinctrl the total number of perstns muxed to the 
> > controller comes to 2, and without a reset-gpios = <...>; to take it 
> > away again, that number stays at 2 to cause the hang upon the DBI read?
> >
> > If so, that'd mean the change resolves the hang for me, because it 
> > brings the total up to 1 (from 0), but also causes the hang for you, 
> > because it brings the total up to 2 (away from 1).
> >
> >>
> >> So I guess I'd think the best way would be to split the pinctrl up 
> >> into the
> >> 3 separate functions (clkreqn, perstn, waken) so that boards can include
> >> them individually.
> >
> > Mm, maybe. Though that might be more appropriate if a board comes 
> > along that doesn't connect all of those signals to the same group of 
> > pins. I worry that attempting to solve this hang by doing that will 
> > instead just mask the real problem.
> >
> >>
> >> Nobody is using the controller pinctrl entries so far anyway :-) .
> >
> > Then it's interesting that this board requires them in order to avoid 
> > a hang on boot. I'll see if there's anything else that I can find out.
> 
> I've just finished testing the latest iteration of this patch with 
> 6.10-rc2 on my RK1 on a Turing Pi 2 carrier board. I can confirm that 
> unpatched mainline fails to boot with the same hang described here, and 
> the patch does make the board boot again.

Can you possibly test if

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=28b8d7793b8573563b3d45321376f36168d77b1e

changes anything? In 6.11-rc1 now.

The PERST# toggling happening before that patch could've caused
issues with your PCIe device.


Heiko



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ