Message-ID: <CA+-6iNyK+A38AWm_j61UdVth2nTSpp2CH6fZX+8oKif3dUdifQ@mail.gmail.com>
Date: Thu, 7 Aug 2025 10:40:20 -0400
From: Jim Quinlan <james.quinlan@...adcom.com>
To: Manivannan Sadhasivam <mani@...nel.org>
Cc: Florian Fainelli <florian.fainelli@...adcom.com>, Bjorn Helgaas <helgaas@...nel.org>, 
	linux-pci@...r.kernel.org, Nicolas Saenz Julienne <nsaenz@...nel.org>, 
	Bjorn Helgaas <bhelgaas@...gle.com>, Lorenzo Pieralisi <lorenzo.pieralisi@....com>, 
	Cyril Brulebois <kibi@...ian.org>, bcm-kernel-feedback-list@...adcom.com, 
	jim2101024@...il.com, Lorenzo Pieralisi <lpieralisi@...nel.org>, 
	Krzysztof Wilczyński <kwilczynski@...nel.org>, 
	Rob Herring <robh@...nel.org>, 
	"moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE" <linux-rpi-kernel@...ts.infradead.org>, 
	"moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE" <linux-arm-kernel@...ts.infradead.org>, 
	open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] PCI: brcmstb: Add panic/die handler to driver

On Thu, Aug 7, 2025 at 1:26 AM Manivannan Sadhasivam <mani@...nel.org> wrote:
>
> On Wed, Aug 06, 2025 at 01:41:35PM GMT, Florian Fainelli wrote:
> > On 8/6/25 11:50, Bjorn Helgaas wrote:
> > > > I'm not sure I understand the "racy" comment.  If the PCIe bridge is
> > > > off, we do not read the PCIe error registers.  In this case, PCIe is
> > > > probably not the cause of the panic.  In the rare case the PCIe
> > > > bridge is off and it was the PCIe that caused the panic, nothing
> > > > gets reported, and this is where we are without this commit.
> > > > Perhaps this is what you mean by "mostly-works".  But this is the
> > > > best that can be done with SW given our HW.
> > >
> > > Right, my fault.  The error report registers don't look like standard
> > > PCIe things, so I suppose they are on the host side, not the PCIe
> > > side, so they're probably guaranteed to be accessible and non-racy
> > > unless the bridge is in reset.
> >
> > To expand upon that part: in the situation I ran into, the PCIe link was
> > down and we had therefore clock gated the PCIe root complex hardware to
> > conserve power. Eventually I hit a voluntary panic, and since all registered
> > panic notifiers are invoked in succession, the one registered for the PCIe RC
> > was invoked as well. Accessing the clock gated registers would not work and
> > would trigger another fault, which would be confusing and mingle with the
> > panic I was trying to debug initially. Hence this check; a clock gated PCIe
> > RC would not be logging any errors anyway.
>
> May I ask how you are recovering from link down? Can the driver detect link down
> using any platform IRQ?

We do have link up/down interrupts on most of our SoCs, but when we once
implemented a handler for them, the interrupts proved unreliable.  We
informed the HW team, but I do not think they implemented any changes.  We
will try again at some point to ascertain the extent of the issue.

AFAICT such a handler is not a panacea.  A link-down handler may be able
to immediately prevent panics caused by config space accesses, by
intercepting them, but it cannot do the same for incoming memory accesses
from the host or endpoint device.
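
For reference, the guard discussed earlier in this thread (skip the error
registers when the bridge is off, otherwise read and report them) can be
sketched roughly as below.  This is a userspace model only, not the
driver's code: struct fake_bridge, fake_read_err() and
fake_panic_notifier() are hypothetical names, and the "register" is a
plain struct field rather than MMIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of the host bridge; not the brcmstb driver's structures. */
struct fake_bridge {
	bool in_reset;        /* bridge held in reset or clock gated */
	uint32_t err_status;  /* stand-in for a host-side error status register */
	unsigned int reads;   /* counts how often the error register was touched */
};

/* Stand-in for an MMIO read; on real HW a read of a gated block could fault. */
static uint32_t fake_read_err(struct fake_bridge *b)
{
	b->reads++;
	return b->err_status;
}

/*
 * Model of the panic/die callback: returns true if an error was reported,
 * false if the bridge was off (registers untouched) or nothing was latched.
 */
static bool fake_panic_notifier(struct fake_bridge *b)
{
	uint32_t status;

	assert(b != NULL);

	/* Bridge off: do not touch the registers, nothing to report. */
	if (b->in_reset)
		return false;

	status = fake_read_err(b);
	if (status == 0)
		return false;

	printf("PCIe error status: 0x%08x\n", (unsigned int)status);
	return true;
}
```

The point of the model is only the ordering: the bridge state is checked
before any register access, so a panic that happens while the RC is gated
does not produce a second fault from the notifier itself.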

Regards,
Jim Quinlan
Broadcom STB/CM
>
> - Mani
>
> --
> மணிவண்ணன் சதாசிவம்
