Message-ID: <Z-EOoIjhOXrT84gX@hovoldconsulting.com>
Date: Mon, 24 Mar 2025 08:49:52 +0100
From: Johan Hovold <johan@...nel.org>
To: Clayton Craft <clayton@...ftyguy.net>
Cc: Johan Hovold <johan+linaro@...nel.org>,
	Jeff Johnson <jjohnson@...nel.org>,
	Miaoqing Pan <quic_miaoqing@...cinc.com>,
	Steev Klimaszewski <steev@...i.org>,
	Jens Glathe <jens.glathe@...schoolsolutions.biz>,
	ath11k@...ts.infradead.org, linux-kernel@...r.kernel.org,
	stable@...r.kernel.org
Subject: Re: [PATCH] wifi: ath11k: fix rx completion meta data corruption

On Sun, Mar 23, 2025 at 11:15:54PM -0700, Clayton Craft wrote:
> On 3/21/25 07:53, Johan Hovold wrote:
> > Add the missing memory barrier to make sure that the REO dest ring
> > descriptor is read after the head pointer to avoid using stale data on
> > weakly ordered architectures like aarch64.
> > 
> > This may fix the ring-buffer corruption worked around by commit
> > f9fff67d2d7c ("wifi: ath11k: Fix SKB corruption in REO destination
> > ring") by silently discarding data, and may possibly also address user
> > reported errors like:
> > 
> > 	ath11k_pci 0006:01:00.0: msdu_done bit in attention is not set
> > 
> > Tested-on: WCN6855 hw2.1 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
> > 
> > Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
> > Cc: stable@...r.kernel.org	# 5.6
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218005
> > Signed-off-by: Johan Hovold <johan+linaro@...nel.org>
> > ---
> > 
> > As I reported here:
> > 
> > 	https://lore.kernel.org/lkml/Z9G5zEOcTdGKm7Ei@hovoldconsulting.com/
> > 
> > the ath11k and ath12k appear to be missing a number of memory barriers
> > that are required on weakly ordered architectures like aarch64 to avoid
> > memory corruption issues.
> > 
> > Here's a fix for one more such case which people already seem to be
> > hitting.
> > 
> > Note that I've also seen one "msdu_done" bit not set warning with this
> > patch applied, so whether it helps with that at all remains to be seen.
> > I'm CCing Jens and Steev, who see these warnings frequently and may be
> > able to help out with testing.
> 
> Before this patch I was seeing this "msdu_done bit" message about 40
> times per hour on average, e.g. a recent boot period of 43 hours saw
> 1600 of these messages. I've been testing this patch for about 10 hours
> now, connected to the same network etc., and haven't seen the
> "msdu_done bit" message once. So, even if it's not completely resolving
> this for everyone, it seems to be a huge improvement for me.
> 
> 0006:01:00.0 Network controller: Qualcomm Technologies, Inc QCNFA765 Wireless Network Adapter (rev 01)
> ath11k_pci 0006:01:00.0: chip_id 0x2 chip_family 0xb board_id 0x8c soc_id 0x400c0210
> ath11k_pci 0006:01:00.0: fw_version 0x11088c35 fw_build_timestamp 2024-04-17 08:34 fw_build_id WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
> 
> Tested-by: Clayton Craft <clayton@...ftyguy.net>

Thanks for testing and confirming my suspicion.

Johan
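
For readers unfamiliar with the ordering problem described in the quoted
patch, here is a minimal userspace sketch of the general pattern: read the
head pointer first, then issue a read barrier, and only then read the
descriptor. This is not the ath11k code and not the patch itself. It uses
C11 atomics as a stand-in for the kernel's barrier primitives, and every
name in it (struct ring, struct desc, ring_push, ring_pop, RING_SIZE) is
hypothetical. The producer also skips the ring-full check to keep the
sketch short.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical ring size */

struct desc {
	uint32_t cookie;	/* hypothetical descriptor payload */
};

struct ring {
	struct desc slots[RING_SIZE];
	_Atomic uint32_t head;	/* advanced by the producer */
	uint32_t tail;		/* private to the consumer */
};

/* Producer: fill the descriptor first, then publish the new head. */
static void ring_push(struct ring *r, uint32_t cookie)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);

	r->slots[head % RING_SIZE].cookie = cookie;
	/* Release: the descriptor write is visible before the new head. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
}

/* Consumer: read the head, then a barrier, then the descriptor. */
static bool ring_pop(struct ring *r, uint32_t *cookie)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);

	if (r->tail == head)
		return false;	/* nothing new to process */

	/*
	 * Without this fence a weakly ordered CPU may perform the
	 * descriptor load below before the head load above and so
	 * observe stale descriptor contents.
	 */
	atomic_thread_fence(memory_order_acquire);

	*cookie = r->slots[r->tail % RING_SIZE].cookie;
	r->tail++;
	return true;
}
```

In a single thread the fence has no observable effect; it only matters
when the producer and the consumer run concurrently, which is the
situation the quoted patch description is about.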
