Message-ID: <8c2f274b-cf05-4fad-b9d6-fa9de1363d42@gmail.com>
Date: Tue, 26 Nov 2024 04:59:35 -0800
From: James Prestwood <prestwoj@...il.com>
To: Remi Pommarel <repk@...plefau.lt>, ath10k@...ts.infradead.org,
linux-wireless@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: Kalle Valo <kvalo@...nel.org>, Jeff Johnson <jjohnson@...nel.org>,
Cedric Veilleux <veilleux.cedric@...il.com>,
Vasanthakumar Thiagarajan <quic_vthiagar@...cinc.com>
Subject: Re: [RESEND PATCH v3 0/2] Improve ath10k flush queue mechanism
On 11/26/24 4:57 AM, James Prestwood wrote:
> Hi Remi,
>
> On 11/22/24 8:48 AM, Remi Pommarel wrote:
>> It has been reported [0] that 3-4 seconds (actually up to 5 seconds) of
>> radio silence could be observed, followed by the error below, on ath10k
>> devices:
>>
>> ath10k_pci 0000:04:00.0: failed to flush transmit queue (skip 0
>> ar-state 1): 0
>>
>> This is due to how the TX queues are flushed in ath10k. When a STA is
>> removed, mac80211 needs to flush queues [1], but because ath10k does not
>> have a lightweight .flush_sta operation, ieee80211_flush_queues() is
>> called instead, effectively blocking the whole queue during the drain
>> and causing this radio silence. Also, because ath10k_flush() waits for
>> all queues to be emptied, not only the flushed ones, it can more easily
>> take up to 5 seconds to finish, making the whole situation worse.
>>
>> The first patch of this series adds a .flush_sta operation to flush only
>> a specific STA's traffic, avoiding the need to stop whole queues, and
>> should be enough by itself to fix the reported issue.
>>
>> The second patch of this series is a proposal to improve ath10k_flush() so
>> that it will be less likely to time out waiting for unrelated queues to
>> drain.
>>
>> The above kernel warning can still be observed (e.g. when flushing a dead
>> STA) but should now be harmless.
>>
>> [0]:
>> https://lore.kernel.org/all/CA+Xfe4FjUmzM5mvPxGbpJsF3SvSdE5_wgxvgFJ0bsdrKODVXCQ@mail.gmail.com/
>> [1]: commit 0b75a1b1e42e ("wifi: mac80211: flush queues on STA removal")
>
> The original report indicated that this only affects AP mode, but
> after seeing this and checking some of our clients I found that it is
> also happening in station mode. I only have clients on 6.2 and 6.8. I
> can confirm it's not occurring on 6.2, but it is on 6.8. I also tried
> your set of patches but did not notice any behavior difference with
> or without them. When it happens, it's always just after a roam scan:
> ~4 seconds go by and we get the failure, followed by a
> "Connection to AP <mac> lost". Oddly, the MAC address is all zeros.
>
> Nov 25 09:09:50 iwd[16256]: src/station.c:station_start_roam() Using
> cached neighbor report for roam
> Nov 25 09:09:54 kernel: ath10k_pci 0000:02:00.0: failed to flush
> transmit queue (skip 0 ar-state 1): 0
> Nov 25 09:09:54 iwd[16256]: src/netdev.c:netdev_mlme_notify() MLME
> notification Del Station(20)
> Nov 25 09:09:54 iwd[16256]: src/netdev.c:netdev_link_notify() event 16
> on ifindex 7
> Nov 25 09:09:54 iwd[16256]: src/netdev.c:netdev_mlme_notify() MLME
> notification Deauthenticate(39)
> Nov 25 09:09:54 iwd[16256]: src/netdev.c:netdev_deauthenticate_event()
> Nov 25 09:09:54 iwd[16256]: src/netdev.c:netdev_mlme_notify() MLME
> notification Disconnect(48)
> Nov 25 09:09:54 iwd[16256]: src/netdev.c:netdev_disconnect_event()
> Nov 25 09:09:54 iwd[16256]: Received Deauthentication event, reason:
> 4, from_ap: false
> Nov 25 09:09:54 kernel: wlan0: Connection to AP 00:00:00:00:00:00 lost
>
> Other times, the above logs are preceded by this:
>
> Nov 26 00:25:25 kernel: ath10k_pci 0000:02:00.0: failed to flush sta
> txq (sta ca:55:b8:7a:91:4b skip 0 ar-state 1): 0
>
> Note, the above logs are with your patches applied. Maybe this is a
> separate issue? Or do you think it's related?
Forgot to mention, this is on the QCA6174 hw 3.2
firmware ver WLAN.RM.4.4.1-00288- api 6 features wowlan,ignore-otp,mfp
crc32 bf907c7c
>
> Thanks,
>
> James
>
>>
>> V3:
>> - Initialize empty to true to fix smatch error
>>
>> V2:
>> - Add Closes tag
>> - Use atomic instead of spinlock for per sta pending frame counter
>> - Call ath10k_htt_tx_sta_dec_pending within rcu
>> - Rename pending_per_queue[] to num_pending_per_queue[]
>>
>> Remi Pommarel (2):
>> wifi: ath10k: Implement ieee80211 flush_sta callback
>> wifi: ath10k: Flush only requested txq in ath10k_flush()
>>
>> drivers/net/wireless/ath/ath10k/core.h | 2 +
>> drivers/net/wireless/ath/ath10k/htt.h | 11 +++-
>> drivers/net/wireless/ath/ath10k/htt_tx.c | 49 +++++++++++++++-
>> drivers/net/wireless/ath/ath10k/mac.c | 75 ++++++++++++++++++++----
>> drivers/net/wireless/ath/ath10k/txrx.c | 11 ++--
>> 5 files changed, 127 insertions(+), 21 deletions(-)
>>
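
For readers following the thread, the per-station pending-frame accounting mentioned in the V2 changelog can be illustrated with a small user-space sketch. This is not the ath10k code: `struct sta_state`, `sta_inc_pending()`, `sta_dec_pending()`, `sta_is_drained()` and `all_drained()` are hypothetical names chosen for illustration, and C11 atomics stand in for the kernel's atomic helpers. It only shows the two wait conditions being contrasted in the cover letter: a per-station flush needs one counter to reach zero, while a whole-device flush needs all of them to. A driver would typically sleep on such a condition with a timeout; that part is left out here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-station TX state; not the ath10k structures. */
struct sta_state {
    atomic_int num_pending; /* frames handed to hardware, not yet completed */
};

/* Called when a frame for this station is queued to hardware. */
static void sta_inc_pending(struct sta_state *sta)
{
    atomic_fetch_add(&sta->num_pending, 1);
}

/* Called from the TX completion path for this station. */
static void sta_dec_pending(struct sta_state *sta)
{
    atomic_fetch_sub(&sta->num_pending, 1);
}

/* Condition for a per-station flush: only this station must be drained. */
static bool sta_is_drained(struct sta_state *sta)
{
    return atomic_load(&sta->num_pending) == 0;
}

/* Condition for a whole-device flush: every station must be drained. */
static bool all_drained(struct sta_state *stas, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!sta_is_drained(&stas[i]))
            return false;
    return true;
}
```

With two stations where only one still has frames outstanding, `sta_is_drained()` on the idle station is already true while `all_drained()` is not, which is why waiting on the per-station condition can finish sooner than waiting on the global one.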