lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMRc=McmTUPqhF9uTdxBttm9RUxLgd68uanbxAVt-jbHe27h2A@mail.gmail.com>
Date: Mon, 16 Dec 2024 14:36:24 +0100
From: Bartosz Golaszewski <brgl@...ev.pl>
To: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
Cc: Stephan Gerhold <stephan.gerhold@...aro.org>, Bjorn Helgaas <bhelgaas@...gle.com>, 
	linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
Subject: Re: [RFT PATCH] Revert "power: sequencing: request the WLAN enable
 GPIO as-is"

On Mon, Dec 16, 2024 at 2:24 PM Manivannan Sadhasivam
<manivannan.sadhasivam@...aro.org> wrote:
>
> On Mon, Dec 16, 2024 at 11:50:20AM +0100, Stephan Gerhold wrote:
> > On Mon, Dec 16, 2024 at 12:35:54PM +0530, Manivannan Sadhasivam wrote:
> > > On Fri, Dec 13, 2024 at 07:19:34PM +0100, Stephan Gerhold wrote:
> > > > On Tue, Dec 03, 2024 at 03:12:51PM +0100, Bartosz Golaszewski wrote:
> > > > > From: Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
> > > > >
> > > > > This reverts commit a9aaf1ff88a8cb99a1335c9eb76de637f0cf8c10.
> > > > >
> > > > > With the changes recently merged into the PCI/pwrctrl/ we now have
> > > > > correct ordering between the pwrseq provider and the PCI-pwrctrl
> > > > > consumers. With that, the pwrseq WCN driver no longer needs to leave the
> > > > > GPIO state as-is and we can remove the workaround.
> > > > >
> > > >
> > > > Should probably revert commit d8b762070c3f ("power: sequencing:
> > > > qcom-wcn: set the wlan-enable GPIO to output") as well?
> > > >
> > > > > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
> > > > > ---
> > > > >  drivers/power/sequencing/pwrseq-qcom-wcn.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/power/sequencing/pwrseq-qcom-wcn.c b/drivers/power/sequencing/pwrseq-qcom-wcn.c
> > > > > index 682a9beac69eb..bb8c47280b7bc 100644
> > > > > --- a/drivers/power/sequencing/pwrseq-qcom-wcn.c
> > > > > +++ b/drivers/power/sequencing/pwrseq-qcom-wcn.c
> > > > > @@ -379,7 +379,7 @@ static int pwrseq_qcom_wcn_probe(struct platform_device *pdev)
> > > > >                                      "Failed to get the Bluetooth enable GPIO\n");
> > > > >
> > > > >         ctx->wlan_gpio = devm_gpiod_get_optional(dev, "wlan-enable",
> > > > > -                                                GPIOD_ASIS);
> > > > > +                                                GPIOD_OUT_LOW);
> > > > >         if (IS_ERR(ctx->wlan_gpio))
> > > > >                 return dev_err_probe(dev, PTR_ERR(ctx->wlan_gpio),
> > > > >                                      "Failed to get the WLAN enable GPIO\n");
> > > > > --
> > > > > 2.30.2
> > > > >
> > > >
> > > > I'm not sure why but applying this patch brings back the error I had
> > > > before. It does seem like setting wlan-enable GPIO happens early enough,
> > > > but maybe some timing is still wrong.
> > > >
> > >
> > > There should be no room for timing issue now :/
> > >
> > > > [   17.132161] <gpiod_set_value_cansleep(ctx->wlan_gpio, 1);>
> > > > [   17.480619] ath12k_pci 0004:01:00.0: of_irq_parse_pci: failed with rc=134
> > > > [   17.491997] ath12k_pci 0004:01:00.0: pci device id mismatch: 0xffff 0x1107
> > > > [   17.492000] ath12k_pci 0004:01:00.0: failed to claim device: -5
> > > > [   17.492075] ath12k_pci 0004:01:00.0: probe with driver ath12k_pci failed with error -5
> > > >
> > >
> > > Are you sure that this is the same error that you noticed before?
> > >
> >
> > Yes, "pci device id mismatch: 0xffff 0x1107" is definitely the same
> > error I saw before.
> >
> > > > Any ideas/suggestions?
> > > >
> > >
> > > Can you verify that the pwrctrl driver's probe is completed *before* ath12k
> > > driver starting to probe by adding the debug prints in both drivers?
> > >
> >
> > I tried booting with "modprobe.blacklist=ath12k" and manually loaded the
> > module several seconds after the boot completed. Error is unchanged:
> >
> > [   16.628165] <gpiod_set_value_cansleep(ctx->wlan_gpio, 1);>
> > [   55.065794] ath12k_pci 0004:01:00.0: of_irq_parse_pci: failed with rc=134
> > [   55.073354] ath12k_pci 0004:01:00.0: pci device id mismatch: 0xffff 0x1107
> > [   55.080457] ath12k_pci 0004:01:00.0: failed to claim device: -5
> > [   55.088977] ath12k_pci 0004:01:00.0: probe with driver ath12k_pci failed with error -5
> >
> > I played around a bit more and it looks like the problem is that the PCI
> > device is still being enumerated during startup, before the pwrseq
> > driver is loaded. Not with the ath12k driver, but the internal PCI
> > subsystem state. And then the PCI link dies briefly when the pwrseq
> > driver loads.
> >
>
> Ok, this seems to be the cause. There is a known issue with Qcom PCIe
> controllers where if the device is powered off abrubtly (without controller
> driver noticing) the PCIe link moves to link down state. Then we need to
> bring the link back by reinitializing the controller. I have it in my todo list
> but no one noticed this issue unless they tried powering down and powering up
> the device (which is rare except on hot pluggable slots).
>
> > If I add some hacks to the DT to force the wlan-enable GPIO low before
> > the entire PCI _controller_ is probed, then it works correctly:
> >
> > [   16.607359] <gpiod_set_value_cansleep(ctx->wlan_gpio, 1);>
> > [   16.668533] pci 0004:01:00.0: [17cb:1107] type 00 class 0x028000 PCIe Endpoint
> > [   16.668606] pci 0004:01:00.0: BAR 0 [mem 0x00000000-0x001fffff 64bit]
> > [   16.669055] pci 0004:01:00.0: PME# supported from D0 D3hot D3cold
> > [   16.675546] pcieport 0004:00:00.0: bridge window [mem 0x7c400000-0x7c5fffff]: assigned
> > [   16.675555] pci 0004:01:00.0: BAR 0 [mem 0x7c400000-0x7c5fffff 64bit]: assigned
> > [   16.862083] ath12k_pci 0004:01:00.0: BAR 0 [mem 0x7c400000-0x7c5fffff 64bit]: assigned
> > [   16.870358] ath12k_pci 0004:01:00.0: enabling device (0000 -> 0002)
> > [   16.888792] ath12k_pci 0004:01:00.0: MSI vectors: 16
> > [   16.893954] ath12k_pci 0004:01:00.0: Hardware name: wcn7850 hw2.0
> >
> > Note the pci messages after enabling the GPIO, before the first
> > ath12k_pci messages. Without the hack, those appear already before the
> > pwrseq driver is being loaded (during initramfs).
> >
> > [    5.888688] pci 0004:01:00.0: [17cb:1107] type 00 class 0x028000 PCIe Endpoint
> > [    5.888758] pci 0004:01:00.0: BAR 0 [mem 0x00000000-0x001fffff 64bit]
> > [    5.889207] pci 0004:01:00.0: PME# supported from D0 D3hot D3cold
> > [    5.902692] pci 0004:00:00.0: bridge window [mem 0x7c400000-0x7c5fffff]: assigned
> > [    5.910311] pci 0004:01:00.0: BAR 0 [mem 0x7c400000-0x7c5fffff 64bit]: assigned
> > ...
> > [   21.227565] <gpiod_set_value_cansleep(ctx->wlan_gpio, 1);>
> > [   21.305496] ath12k_pci 0004:01:00.0: of_irq_parse_pci: failed with rc=134
> > [   21.318382] ath12k_pci 0004:01:00.0: pci device id mismatch: 0xffff 0x1107
> > [   21.338489] ath12k_pci 0004:01:00.0: failed to claim device: -5
> > [   21.338555] ath12k_pci 0004:01:00.0: probe with driver ath12k_pci failed with error -5
> >
> > Can we skip scanning the PCI bus until the power sequencing is done?
> >
>
> This won't help (but a good idea anyway that I'll implement). See below...
>
> > The hack I used (see below) works, but is a bit odd since it requires
> > assigning the wcn7850-pmu pinctrl to the PCI bus in the DT. Otherwise
> > the GPIO is not forced low early enough.
> >
>
> Your hack is making sure that the default state of the GPIO is not changed at
> all after initializing the controller. So even if the pwrctrl driver probes
> later, it will try to enable the module by doing,
> 'gpiod_set_value_cansleep(ctx->wlan_gpio, 1)', which would do nothing to the
> device state.
>
> So the issue is not with the pwrctrl driver but with the controller
> implementation. Ideally, once the device is removed, the PCIe link should move
> to Detect state and then to Polling state once the receiver is detected on the
> lanes. But the DWC and Qcom glue has other logics that prevents the controller
> from doing so.
>
> So until the link down handling is implemented in the controller driver, we need
> to carry this hack that preserves the GPIO state.
>

Thanks for the explanation Mani. Regarding this patch: I suggest we
keep it for now but maybe I'll add a comment saying why it's still
necessary?

Bart

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ