lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5db50a8e5b251714cebe0a719ee9dc73@kernel.org>
Date:   Tue, 01 Sep 2020 16:37:42 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     Samuel Dionne-Riel <samuel@...nne-riel.com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Rob Herring <robh@...nel.org>
Cc:     devicetree@...r.kernel.org, Frank Rowand <frowand.list@...il.com>,
        linux-kernel@...r.kernel.org
Subject: Re: Boot failure on gru-scarlet-inx with 5.9-rc2

On 2020-09-01 04:45, Samuel Dionne-Riel wrote:
> On Mon, 31 Aug 2020 10:27:37 +0100
> Marc Zyngier <maz@...nel.org> wrote:
>> 
>> Ah, so actually anything that *enables pcie* kills your system.
>> Great investigative work!
>> 
>> >
>> > And backed by a further bisection with this that points to
>> > d84c572de1a360501d2e439ac632126f5facf59d being the actual change
>> > that causes the tablet to fail to boot, as long as the pcie0 node is
>> > identified as pci properly.
>> >
>> > I am unsure if I should add as a Cc everyone involved in that change
>> > set, though the author (coincidentally) is already in the original
>> > list of recipients.
>> 
>> I've deliberately moved Rob from Cc to To... ;-)
> 
> Thanks, I don't actually know who to write to exactly.

Given that this is a PCI regression, I guess the PCI maintainers
are the likely victims. Adding Bjorn and Lorenzo to the list in
addition to Rob.

You can find the relevant people by looking at the MAINTAINERS
file.

>> > Any additional thoughts from this additional information?
>> 
>> What you could do is to start looking at which of the
>> pci_is_root_bus() changes breaks PCIe on this system.  The fact that
>> it breaks on your system and not on mine is a bit puzzling.
> 
> Let me show you, on top of v5.9-rc3 I can successfully boot using this
> partial revert / adaptation of d84c572d. In addition, it also allows
> the Wi-Fi to work again, compared to how it didn't in 5.9-rc1 or
> 5.9-rc[23] with the dumb revert of your fix.
> 
> So, if we number each pci_is_root_bus by order appearance, it is only
> the second use, in rockchip_pcie_valid_device, which seem to cause
> scarlet not to boot.
> 
> The patch (not actually a patch submission) reverts only that instance
> of pci_is_root_bus, while also doing some leg work to put back some
> functionally equivalent code that was refactored away since.
> 
> If there's anything else you want me to try, don't hesitate.
> 
> ---
>  drivers/pci/controller/pcie-rockchip-host.c | 8 +++++++-
>  drivers/pci/controller/pcie-rockchip.h      | 1 +
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pcie-rockchip-host.c
> b/drivers/pci/controller/pcie-rockchip-host.c
> index 0bb2fb3e8a0b..5a27fa833fbd 100644
> --- a/drivers/pci/controller/pcie-rockchip-host.c
> +++ b/drivers/pci/controller/pcie-rockchip-host.c
> @@ -79,7 +79,7 @@ static int rockchip_pcie_valid_device(struct
> rockchip_pcie *rockchip,
>  	 * do not read more than one device on the bus directly attached
>  	 * to RC's downstream side.
>  	 */
> -	if (pci_is_root_bus(bus->parent) && dev > 0)
> +	if (bus->primary == rockchip->root_bus_nr && dev > 0)
>  		return 0;
> 
>  	return 1;
> @@ -944,6 +944,7 @@ static int rockchip_pcie_probe(struct 
> platform_device *pdev)
>  	struct rockchip_pcie *rockchip;
>  	struct device *dev = &pdev->dev;
>  	struct pci_host_bridge *bridge;
> +	struct resource *bus_res;
>  	int err;
> 
>  	if (!dev->of_node)
> @@ -983,6 +984,11 @@ static int rockchip_pcie_probe(struct
> platform_device *pdev)
>  	if (err < 0)
>  		goto err_deinit_port;
> 
> +	/* HACK; ~equiv to last param of pci_parse_request_of_pci_ranges */
> +	bus_res = (resource_list_first_type(&bridge->windows, 
> IORESOURCE_MEM))->res;
> +
> +	rockchip->root_bus_nr = bus_res->start;
> +
>  	err = rockchip_pcie_cfg_atu(rockchip);
>  	if (err)
>  		goto err_remove_irq_domain;
> diff --git a/drivers/pci/controller/pcie-rockchip.h
> b/drivers/pci/controller/pcie-rockchip.h
> index c7d0178fc8c2..0952fec7e34d 100644
> --- a/drivers/pci/controller/pcie-rockchip.h
> +++ b/drivers/pci/controller/pcie-rockchip.h
> @@ -298,6 +298,7 @@ struct rockchip_pcie {
>  	struct	gpio_desc *ep_gpio;
>  	u32	lanes;
>  	u8      lanes_map;
> +	u8      root_bus_nr;
>  	int	link_gen;
>  	struct	device *dev;
>  	struct	irq_domain *irq_domain;
> --
> 2.25.4
> 
> 
> Thanks again!

Hmmm. It seems that the original commit (d84c572d) considered that
root_bus_nr was always zero, while it may not have been.

Rob, Lorenzo: do you guys have any idea what is going on here?

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ