[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <251bd696-9029-ec5a-8b0c-da78a0c8b2eb@gmail.com>
Date:   Fri, 9 Jul 2021 21:27:51 +0200
From:   Maximilian Luz <luzmaximilian@...il.com>
To:     Pali Rohár <pali@...nel.org>
Cc:     Jonas Dreßler <verdre@...d.nl>,
        Amitkumar Karwar <amitkarwar@...il.com>,
        Ganapathi Bhat <ganapathi.bhat@....com>,
        Xinming Hu <huxinming820@...il.com>,
        Kalle Valo <kvalo@...eaurora.org>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Tsuchiya Yuto <kitakar@...il.com>,
        linux-wireless@...r.kernel.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>
Subject: Re: [PATCH v2 2/2] mwifiex: pcie: add reset_d3cold quirk for Surface
 gen4+ devices
On 7/9/21 8:44 PM, Pali Rohár wrote:
[...]
>> My (very) quick attempt ('echo 1 > /sys/bus/pci/.../reset) at
>> reproducing this didn't work, so I think at very least a network
>> connection needs to be active.
> 
> This is doing PCIe function level reset. Maybe you can get more luck
> with PCIe Hot Reset. See following link how to trigger PCIe Hot Reset
> from userspace: https://alexforencich.com/wiki/en/pcie/hot-reset-linux
Thanks for that link! That does indeed do something which breaks the
adapter. Running the script produces
   [  178.388414] mwifiex_pcie 0000:01:00.0: PREP_CMD: card is removed
   [  178.389128] mwifiex_pcie 0000:01:00.0: PREP_CMD: card is removed
   [  178.461365] mwifiex_pcie 0000:01:00.0: performing cancel_work_sync()...
   [  178.461373] mwifiex_pcie 0000:01:00.0: cancel_work_sync() done
   [  178.984106] pci 0000:01:00.0: [11ab:2b38] type 00 class 0x020000
   [  178.984161] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit pref]
   [  178.984193] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x000fffff 64bit pref]
   [  178.984430] pci 0000:01:00.0: supports D1 D2
   [  178.984434] pci 0000:01:00.0: PME# supported from D0 D1 D3hot D3cold
   [  178.984871] pcieport 0000:00:1c.0: ASPM: current common clock configuration is inconsistent, reconfiguring
   [  179.297919] pci 0000:01:00.0: BAR 0: assigned [mem 0xd4400000-0xd44fffff 64bit pref]
   [  179.297961] pci 0000:01:00.0: BAR 2: assigned [mem 0xd4500000-0xd45fffff 64bit pref]
   [  179.298316] mwifiex_pcie 0000:01:00.0: enabling device (0000 -> 0002)
   [  179.298752] mwifiex_pcie: PCI memory map Virt0: 00000000c4593df1 PCI memory map Virt2: 0000000039d67daf
   [  179.300522] mwifiex_pcie 0000:01:00.0: WLAN read winner status failed!
   [  179.300552] mwifiex_pcie 0000:01:00.0: info: _mwifiex_fw_dpc: unregister device
   [  179.300622] mwifiex_pcie 0000:01:00.0: Read register failed
   [  179.300912] mwifiex_pcie 0000:01:00.0: performing cancel_work_sync()...
   [  179.300928] mwifiex_pcie 0000:01:00.0: cancel_work_sync() done
after which the card is unusable (there is no WiFi interface availabel
any more). Reloading the driver module doesn't help and produces
   [  376.906833] mwifiex_pcie: PCI memory map Virt0: 0000000025149d28 PCI memory map Virt2: 00000000c4593df1
   [  376.907278] mwifiex_pcie 0000:01:00.0: WLAN read winner status failed!
   [  376.907281] mwifiex_pcie 0000:01:00.0: info: _mwifiex_fw_dpc: unregister device
   [  376.907293] mwifiex_pcie 0000:01:00.0: Read register failed
   [  376.907404] mwifiex_pcie 0000:01:00.0: performing cancel_work_sync()...
   [  376.907406] mwifiex_pcie 0000:01:00.0: cancel_work_sync() done
again. Performing a function level reset produces
   [  402.489572] mwifiex_pcie 0000:01:00.0: mwifiex_pcie_reset_prepare: adapter structure is not valid
   [  403.514219] mwifiex_pcie 0000:01:00.0: mwifiex_pcie_reset_done: adapter structure is not valid
and doesn't help either.
Running the same command on a kernel with (among other) this patch
unfortunately also breaks the adapter in the same way. As far as I can
tell though, it doesn't run through the reset code added by this patch
(as indicated by the log message when performing FLR), which I assume
in a non-forced scenario, e.g. firmware issues (which IIRC is why this
patch exists), it would?
>> Unfortunately I can't test that with a
>> network connection (and without compiling a custom kernel for which I
>> don't have the time right now) because there's currently another bug
>> deadlocking on device removal if there's an active connection during
>> removal (which also seems to trigger on reset). That one ill be fixed
>> by
>>
>>    https://lore.kernel.org/linux-wireless/20210515024227.2159311-1-briannorris@chromium.org/
>>
>> Jonas might know more.
[...]
Powered by blists - more mailing lists
 
