Message-ID: <51a59b41-4214-4e24-bfe8-3d8174ba1a3b@craftyguy.net>
Date: Sun, 23 Mar 2025 23:15:54 -0700
From: Clayton Craft <clayton@...ftyguy.net>
To: Johan Hovold <johan+linaro@...nel.org>, Jeff Johnson <jjohnson@...nel.org>
Cc: Miaoqing Pan <quic_miaoqing@...cinc.com>,
Steev Klimaszewski <steev@...i.org>,
Jens Glathe <jens.glathe@...schoolsolutions.biz>,
ath11k@...ts.infradead.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH] wifi: ath11k: fix rx completion meta data corruption
On 3/21/25 07:53, Johan Hovold wrote:
> Add the missing memory barrier to make sure that the REO dest ring
> descriptor is read after the head pointer to avoid using stale data on
> weakly ordered architectures like aarch64.
>
> This may fix the ring-buffer corruption worked around by commit
> f9fff67d2d7c ("wifi: ath11k: Fix SKB corruption in REO destination
> ring") by silently discarding data, and may possibly also address user
> reported errors like:
>
> ath11k_pci 0006:01:00.0: msdu_done bit in attention is not set
>
> Tested-on: WCN6855 hw2.1 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
>
> Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
> Cc: stable@...r.kernel.org # 5.6
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218005
> Signed-off-by: Johan Hovold <johan+linaro@...nel.org>
> ---
>
> As I reported here:
>
> https://lore.kernel.org/lkml/Z9G5zEOcTdGKm7Ei@hovoldconsulting.com/
>
> the ath11k and ath12k appear to be missing a number of memory barriers
> that are required on weakly ordered architectures like aarch64 to avoid
> memory corruption issues.
>
> Here's a fix for one more such case which people already seem to be
> hitting.
>
> Note that I've seen one "msdu_done" bit not set warning also with this
> patch so whether it helps with that at all remains to be seen. I'm CCing
> Jens and Steev that see these warnings frequently and that may be able
> to help out with testing.
>
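The ordering requirement described in the quoted commit message can be illustrated with a small userspace sketch. This is not the ath11k code: the ring layout, `struct demo_desc`, `demo_ring_push()` and `demo_ring_pop()` are hypothetical names made up for illustration, and C11 atomics stand in for the kernel primitives (in the driver a read barrier such as dma_rmb() would presumably take the place of the acquire fence). The point is only that the consumer must not load the descriptor until after it has loaded the head pointer that publishes it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u

/* Hypothetical descriptor; not the real REO destination ring layout. */
struct demo_desc {
	uint32_t cookie;
	uint32_t len;
};

struct demo_ring {
	struct demo_desc desc[RING_SIZE];
	_Atomic uint32_t head;	/* advanced by the producer ("device") */
	uint32_t tail;		/* owned by the consumer ("driver") */
};

/*
 * Producer: write the descriptor first, then publish it by advancing
 * head. The release store keeps the descriptor write from being
 * reordered after the head update. No overflow handling in this sketch.
 */
static void demo_ring_push(struct demo_ring *r, uint32_t cookie, uint32_t len)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);

	r->desc[head % RING_SIZE] = (struct demo_desc){ cookie, len };
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
}

/* Consumer: returns 1 and copies out one descriptor, or 0 if empty. */
static int demo_ring_pop(struct demo_ring *r, struct demo_desc *out)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);

	if (r->tail == head)
		return 0;

	/*
	 * Make sure the descriptor is read after the head pointer.
	 * Without this fence a weakly ordered CPU may satisfy the
	 * descriptor load early and hand back stale contents.
	 */
	atomic_thread_fence(memory_order_acquire);

	*out = r->desc[r->tail % RING_SIZE];
	r->tail++;
	return 1;
}
```

On strongly ordered machines the fence usually costs nothing and the bug stays hidden, which would be consistent with reports coming mainly from aarch64 systems.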
Before this patch I was seeing the "msdu_done bit" message about 40
times per hour on average; for example, a recent 43-hour uptime logged
about 1600 of them. I've now been testing this patch for about 10 hours,
connected to the same network, and haven't seen the message once. So
even if it doesn't completely resolve this for everyone, it seems to be
a huge improvement for me.

0006:01:00.0 Network controller: Qualcomm Technologies, Inc QCNFA765 Wireless Network Adapter (rev 01)

ath11k_pci 0006:01:00.0: chip_id 0x2 chip_family 0xb board_id 0x8c soc_id 0x400c0210
ath11k_pci 0006:01:00.0: fw_version 0x11088c35 fw_build_timestamp 2024-04-17 08:34 fw_build_id WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41

Tested-by: Clayton Craft <clayton@...ftyguy.net>