[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <985049b8-bad7-6f18-c94f-368059dd6f95@gmail.com>
Date: Wed, 1 Sep 2021 18:51:21 +0200
From: Heiner Kallweit <hkallweit1@...il.com>
To: Pali Rohár <pali@...nel.org>,
Jonas Dreßler <verdre@...d.nl>
Cc: Andy Shevchenko <andy.shevchenko@...il.com>,
Amitkumar Karwar <amitkarwar@...il.com>,
Ganapathi Bhat <ganapathi017@...il.com>,
Xinming Hu <huxinming820@...il.com>,
Kalle Valo <kvalo@...eaurora.org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Tsuchiya Yuto <kitakar@...il.com>,
"open list:TI WILINK WIRELES..." <linux-wireless@...r.kernel.org>,
netdev <netdev@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-pci <linux-pci@...r.kernel.org>,
Maximilian Luz <luzmaximilian@...il.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Bjorn Helgaas <bhelgaas@...gle.com>
Subject: Re: [PATCH 1/2] mwifiex: Use non-posted PCI register writes
On 01.09.2021 17:51, Pali Rohár wrote:
> On Wednesday 01 September 2021 16:01:54 Jonas Dreßler wrote:
>> On 8/30/21 2:49 PM, Andy Shevchenko wrote:
>>> On Mon, Aug 30, 2021 at 3:38 PM Jonas Dreßler <verdre@...d.nl> wrote:
>>>>
>>>> On the 88W8897 card it's very important the TX ring write pointer is
>>>> updated correctly to its new value before setting the TX ready
>>>> interrupt, otherwise the firmware appears to crash (probably because
>>>> it's trying to DMA-read from the wrong place).
>>>>
This sounds somehow like the typical case where you write DMA descriptors
and then ring the doorbell. This normally requires a dma_wmb().
Maybe something like that is missing here?
Reading back all register writes may cause a certain performance impact,
and if it can be avoided we should try to avoid it.
>>>> Since PCI uses "posted writes" when writing to a register, it's not
>>>> guaranteed that a write will happen immediately. That means the pointer
>>>> might be outdated when setting the TX ready interrupt, leading to
>>>> firmware crashes especially when ASPM L1 and L1 substates are enabled
>>>> (because of the higher link latency, the write will probably take
>>>> longer).
>>>>
>>>> So fix those firmware crashes by always forcing non-posted writes. We do
>>>> that by simply reading back the register after writing it, just as a lot
>>>> of other drivers do.
>>>>
>>>> There are two reproducers that are fixed with this patch:
>>>>
>>>> 1) During rx/tx traffic and with ASPM L1 substates enabled (the enabled
>>>> substates are platform dependent), the firmware crashes and eventually a
>>>> command timeout appears in the logs. That crash is fixed by using a
>>>> non-posted write in mwifiex_pcie_send_data().
>>>>
>>>> 2) When sending lots of commands to the card, waking it up from sleep in
>>>> very quick intervals, the firmware eventually crashes. That crash
>>>> appears to be fixed by some other non-posted write included here.
>>>
>>> Thanks for all this work!
>>>
>>> Nevertheless, do we have any commits that may be a good candidate to
>>> be in the Fixes tag here?
>>>
>>
>> I don't think there's any commit we could point to, given that the bug is
>> probably somewhere in the firmware code.
>
> Then please add Cc: stable@...r.kernel.org tag into commit message. Such
> bugfix is a good candidate for backporting into stable releases.
>
>>>> Signed-off-by: Jonas Dreßler <verdre@...d.nl>
Powered by blists - more mailing lists