lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 10 May 2023 13:35:43 +0200
From:   "Vincenzo Palazzo" <vincenzopalazzodev@...il.com>
To:     "Bjorn Helgaas" <helgaas@...nel.org>
Cc:     <linux-pci@...r.kernel.org>, <robh@...nel.org>, <heiko@...ech.de>,
        <kw@...ux.com>, <shawn.lin@...k-chips.com>,
        <linux-kernel@...r.kernel.org>, <lgirdwood@...il.com>,
        <linux-rockchip@...ts.infradead.org>, <broonie@...nel.org>,
        <bhelgaas@...gle.com>,
        <linux-kernel-mentees@...ts.linuxfoundation.org>,
        <lpieralisi@...nel.org>, <linux-arm-kernel@...ts.infradead.org>,
        "Dan Johansen" <strit@...jaro.org>
Subject: Re: [PATCH v1] drivers: pci: introduce configurable delay for
 Rockchip PCIe bus scan

> Hi Vincenzo,

Hi :)

> Thanks for raising this issue.  Let's see what we can do to address
> it.

Yeah, as I said in my cover letter, I am not happy with my solution,
but we should start somewhere to discuss it.

> > Add a configurable delay to the Rockchip PCIe driver to address
> > crashes that occur on some old devices, such as the Pine64 RockPro64.
> > 
> > This issue is affecting the ARM community, but there is no
> > upstream solution for it yet.
>
> It sounds like this happens with several endpoints, right?  And I
> assume the endpoints work fine in other non-Rockchip systems?  If
> that's the case, my guess is the problem is with the Rockchip host
> controller and how it's initialized, not with the endpoints.


Yeah, the crash is only reproducible with the Rockchip system, or better, 
the crash is reproducible only in some modern devices that use the old 
Rockchip driver mentioned in this patch.

> The only delays and timeouts I see in the driver now are in
> rockchip_pcie_host_init_port(), where it waits for link training to
> complete.  I assume the link training did completely successfully
> since you don't mention either a gen1 or gen2 timeout (although the
> gen2 message is a dev_dbg() that normally wouldn't go to the console).
>
> I don't know that the spec contains a retrain timeout value.  Several
> other drivers use 1 second, while rockchip uses 500ms (for example,
> see LINK_RETRAIN_TIMEOUT and LINK_UP_TIMEOUT).
>
> I think we need to understand the issue better before adding a DT
> property and a module parameter.  Those are hard for users to deal
> with.  If we can figure out a value that works for everybody, it would
> be better to just hard-code it in the driver and use that all the
> time.

Yeah, I see, I see. This makes sense. Is there any path that I can follow in 
order to better understand what's going on at the hardware level? In other 
words, how can I help to understand this issue better and provide a
unique solution for everybody?

Thanks for the nits in the patch, I will take a look with a fresh mind
later in the day.

Cheers!

Vincent.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ