Message-ID: <CAL_JsqJLTkDm_ZbFWSKwKvVAh0KpxiS9y6LEwmhQ-kejTcLq7A@mail.gmail.com>
Date:   Tue, 8 Feb 2022 08:34:45 -0600
From:   Rob Herring <robh@...nel.org>
To:     dann frazier <dann.frazier@...onical.com>
Cc:     Toan Le <toan@...amperecomputing.com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        Krzysztof Wilczyński <kw@...ux.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Andrew Murray <amurray@...goodpenguin.co.uk>,
        Stéphane Graber <stgraber@...ntu.com>,
        stable <stable@...r.kernel.org>, PCI <linux-pci@...r.kernel.org>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] PCI: xgene: Fix IB window setup

On Mon, Feb 7, 2022 at 7:19 PM dann frazier <dann.frazier@...onical.com> wrote:
>
> On Mon, Feb 07, 2022 at 10:09:31AM -0600, Rob Herring wrote:
> > On Sat, Feb 5, 2022 at 3:13 PM dann frazier <dann.frazier@...onical.com> wrote:
> > >
> > > On Sat, Feb 5, 2022 at 9:05 AM Rob Herring <robh@...nel.org> wrote:
> > > >
> > > > On Fri, Feb 4, 2022 at 5:01 PM dann frazier <dann.frazier@...onical.com> wrote:
> > > > >
> > > > > On Mon, Nov 29, 2021 at 11:36:37AM -0600, Rob Herring wrote:
> > > > > > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> > > > > > broke PCI support on XGene. The cause is the IB resources are now sorted
> > > > > > > in address order instead of being in DT dma-ranges order. As a result,
> > > > > > > the inbound registers used for each region are swapped. I don't
> > > > > > know the details about this h/w, but it appears that IB region 0
> > > > > > registers can't handle a size greater than 4GB. In any case, limiting
> > > > > > the size for region 0 is enough to get back to the original assignment
> > > > > > of dma-ranges to regions.
> > > > >
> > > > > hey Rob!
> > > > >
> > > > > > I've been seeing a panic on HP Moonshot m400 cartridges (X-Gene1) -
> > > > > only during network installs - that I also bisected down to commit
> > > > > 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup"). I was
> > > > > hoping that this patch that fixed the issue on Stéphane's X-Gene2
> > > > > > system would also fix my issue, but no luck. In fact, it seems to just
> > > > > > make it fail differently. Reverting both patches is required to get a
> > > > > v5.17-rc kernel to boot.
> > > > >
> > > > > I've collected the following logs - let me know if anything else would
> > > > > be useful.
> > > > >
> > > > > 1) v5.17-rc2+ (unmodified):
> > > > >    http://dannf.org/bugs/m400-no-reverts.log
> > > > >    Note that the mlx4 driver fails initialization.
> > > > >
> > > > > 2) v5.17-rc2+, w/o the commit that fixed Stéphane's system:
> > > > >    http://dannf.org/bugs/m400-xgene2-fix-reverted.log
> > > > >    Note the mlx4 MSI-X timeout, and later panic.
> > > > >
> > > > > 3) v5.17-rc2+, w/ both commits reverted (works)
> > > > >    http://dannf.org/bugs/m400-both-reverted.log
> > > >
> > > > The ranges and dma-ranges addresses don't appear to match up with any
> > > > upstream dts files. Can you send me the DT?
> > >
> > > Sure: http://dannf.org/bugs/fdt
> >
> > The first fix certainly is a problem. It's going to need something
> > besides size to key off of (originally it was dependent on order of
> > dma-ranges entries).
> >
> > The 2nd issue is the 'dma-ranges' has a second entry that is now ignored:
> >
> > dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00>,
> >              <0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> >
> > Based on the flags (3rd addr cell: 0x0), we have an inbound config
> > space which the kernel now ignores because inbound config space
> > accesses make no sense. But clearly some setup is needed. Upstream, in
> > contrast, sets up a memory range that includes this region, so the
> > setup does happen:
> >
> > <0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>
> >
> > Minimally, I suspect it will work if you change dma-ranges 2nd entry to:
> >
> > <0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>
>
> Thanks for looking into this Rob. I tried to test that theory, but it
> didn't seem to work. This is what I tried:
>
> --- m400.dts    2022-02-07 20:16:44.840475323 +0000
> +++ m400.dts.dmaonly    2022-02-08 00:17:54.097132000 +0000
> @@ -446,7 +446,7 @@
>                         reg = <0x00 0x1f2b0000 0x00 0x10000 0xe0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
>                         reg-names = "csr\0cfg\0msi_gen\0msi_term";
>                         ranges = <0x1000000 0x00 0x00 0xe0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xe1 0x30000000 0x00 0x80000000>;
> -                       dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> +                       dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
>                         ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
>                         ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
>                         interrupts = <0x00 0x10 0x04>;
> @@ -471,7 +471,7 @@
>                         reg = <0x00 0x1f2c0000 0x00 0x10000 0xd0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
>                         reg-names = "csr\0cfg\0msi_gen\0msi_term";
>                         ranges = <0x1000000 0x00 0x00 0xd0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xd1 0x30000000 0x00 0x80000000>;
> -                       dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> +                       dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
>                         ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
>                         ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
>                         interrupts = <0x00 0x10 0x04>;
> @@ -496,7 +496,7 @@
>                         reg = <0x00 0x1f2d0000 0x00 0x10000 0x90 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
>                         reg-names = "csr\0cfg\0msi_gen\0msi_term";
>                         ranges = <0x1000000 0x00 0x00 0x90 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0x91 0x30000000 0x00 0x80000000>;
> -                       dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> +                       dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
>                         ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
>                         ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
>                         interrupts = <0x00 0x10 0x04>;
> @@ -522,7 +522,7 @@
>                         reg = <0x00 0x1f500000 0x00 0x10000 0xa0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
>                         reg-names = "csr\0cfg\0msi_gen\0msi_term";
>                         ranges = <0x2000000 0x00 0x30000000 0xa1 0x30000000 0x00 0x80000000>;
> -                       dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> +                       dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
>                         ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
>                         ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
>                         interrupts = <0x00 0x10 0x04>;
> @@ -547,7 +547,7 @@
>                         reg = <0x00 0x1f510000 0x00 0x10000 0xc0 0xd0000000 0x00 0x200000 0x00 0x79e00000 0x00 0x2000000 0x00 0x79000000 0x00 0x800000>;
>                         reg-names = "csr\0cfg\0msi_gen\0msi_term";
>                         ranges = <0x1000000 0x00 0x00 0xc0 0x10000000 0x00 0x10000 0x2000000 0x00 0x30000000 0xc1 0x30000000 0x00 0x80000000>;
> -                       dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
> +                       dma-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x42000000 0x79000000 0x00 0x79000000 0x00 0x800000>;
>                         ib-ranges = <0x42000000 0x40 0x00 0x40 0x00 0x40 0x00 0x00 0x00 0x79000000 0x00 0x79000000 0x00 0x800000>;
>                         ib-ranges-ep = <0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x00 0x00 0x00 0x00 0x400000 0x2000000 0x00 0x79000000 0x00 0x79000000 0x00 0x100000>;
>                         interrupts = <0x00 0x10 0x04>;
>
> And that failed to boot with a 5.17-rc3. Since dma-ranges was
> previously identical to ib-ranges, I also tried making the same change
> to ib-ranges, but with no success.

Did it fail to boot at all, or did PCIe still not work, causing boot to
eventually fail? 'ib-ranges' is unknown to the kernel, so is the firmware
using it somehow?

You also need to revert the first fix for PCIe to work.
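
For anyone editing these properties by hand, here is a rough sketch of how
the flags can be read off a dma-ranges cell list. It assumes 3 child (PCI)
address cells, 2 parent address cells and 2 size cells per entry, which the
thread does not state explicitly, and it uses the standard PCI bus binding
layout for the first cell (space code in bits 25:24, prefetchable in bit 30).
The helper names are made up for illustration and are not kernel code.

```python
# Sketch only: split a PCI 'dma-ranges' cell list into entries and decode the
# address-space flags carried in the first (phys.hi) cell of each entry.
# Assumption (not stated in the thread): 3 child address cells, 2 parent
# address cells and 2 size cells, i.e. 7 cells per entry.

SPACE_CODES = {0: "config", 1: "I/O", 2: "32-bit memory", 3: "64-bit memory"}


def cells_to_int(cells):
    """Combine big-endian 32-bit cells into a single integer."""
    value = 0
    for cell in cells:
        value = (value << 32) | (cell & 0xFFFFFFFF)
    return value


def decode_dma_ranges(cells, child_cells=3, parent_cells=2, size_cells=2):
    """Return one dict per dma-ranges entry (hypothetical helper)."""
    stride = child_cells + parent_cells + size_cells
    if len(cells) % stride != 0:
        raise ValueError("cell count is not a multiple of the entry size")
    entries = []
    for start in range(0, len(cells), stride):
        entry = cells[start:start + stride]
        phys_hi = entry[0]
        parent = entry[child_cells:child_cells + parent_cells]
        size = entry[child_cells + parent_cells:]
        entries.append({
            "space": SPACE_CODES[(phys_hi >> 24) & 0x3],
            "prefetchable": bool(phys_hi & (1 << 30)),
            "pci_addr": cells_to_int(entry[1:child_cells]),
            "cpu_addr": cells_to_int(parent),
            "size": cells_to_int(size),
        })
    return entries


# The original m400 dma-ranges property from the diff above, as a flat list.
M400_DMA_RANGES = [
    0x42000000, 0x40, 0x00, 0x40, 0x00, 0x40, 0x00,
    0x00, 0x00, 0x79000000, 0x00, 0x79000000, 0x00, 0x800000,
]

if __name__ == "__main__":
    for entry in decode_dma_ranges(M400_DMA_RANGES):
        print(entry["space"], hex(entry["pci_addr"]),
              hex(entry["cpu_addr"]), hex(entry["size"]))
```

Under those assumptions the second entry decodes as config space covering
0x79000000 with size 0x800000. Feeding an edited property through the same
function shows which cell a change actually lands in before booting with it.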


> > While we shouldn't break existing DTs, the moonshot DT doesn't use
> > what's documented upstream. There are multiple differences compared to
> > what's documented. Is upstream supposed to support upstream DTs,
> > downstream DTs, and ACPI for XGene which is an abandoned platform with
> > only a handful of users?
>
> That's a fair question, though it's one of policy, and I feel I'd be
> overstepping by weighing in. I suppose one option I have is to try
> and create and upstream a dts for these systems and modify our
> boot.scr to always load that over the one provided by firmware. While
> we do have some of these systems in production, they are being retired
> and replaced with newer kit over time, and it's possible we'll never
> need to upgrade them to a modern kernel.
>
>   -dann
