linux-kernel - Re: ath12k: REO status on PPC does not work

Message-ID: <aKbCq3SLsBQcQWIh@FUE-ALEWI-WINX>
Date: Thu, 21 Aug 2025 08:54:35 +0200
From: Alexander Wilhelm <alexander.wilhelm@...termo.com>
To: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@....qualcomm.com>
Cc: Baochen Qiang <baochen.qiang@....qualcomm.com>,
        Jeff Johnson <jjohnson@...nel.org>, ath12k@...ts.infradead.org,
        linux-wireless@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: ath12k: REO status on PPC does not work

On Tue, Aug 19, 2025 at 02:51:16PM +0530, Vasanthakumar Thiagarajan wrote:
> 
> 
> On 8/19/2025 1:40 PM, Alexander Wilhelm wrote:
> > On Tue, Aug 19, 2025 at 03:26:34PM +0800, Baochen Qiang wrote:
> > > 
> > > 
> > > On 8/19/2025 2:59 PM, Alexander Wilhelm wrote:
> > > > On Tue, Aug 19, 2025 at 02:38:38PM +0800, Baochen Qiang wrote:
> > > > > 
> > > > > 
> > > > > On 8/15/2025 4:13 PM, Alexander Wilhelm wrote:
> > > > > > Hello devs,
> > > > > > 
> > > > > > I'm currently working on getting the 'ath12k' driver running on a big endian
> > > > > > PowerPC platform and have encountered the following issue.
> > > > > > 
> > > > > > In the function 'ath12k_dp_rx_process_reo_status', the REO status is determined
> > > > > > by inspecting memory that the hardware has previously written via DMA.
> > > > > > Specifically, during the call to 'ath12k_hal_srng_access_begin', the driver
> > > > > > reads the value of 'hp_addr' for the destination ring (in my case, always with
> > > > > > ID 21). On the big endian platform, this value is consistently 0, which prevents
> > > > > > the REO status from being updated.
> > > > > 
> > > > > This does not look like an endianness issue to me, because with either byte
> > > > > order we should get a value other than 0.
> > > > 
> > > > Really? I always assumed the value remains 0 until the firmware writes something
> > > > to memory and moves the head pointer of the SRNG ring buffer. By the way, I've
> > > 
> > > correct!
> > > 
> > > > already implemented the missing endianness conversions for reading from and
> > > > writing to ring buffer pointers like this one:
> > > > 
> > > >      hp = le32_to_cpu(*srng->u.dst_ring.hp_addr);
> > > 
> > > What I actually meant is that, once hp gets updated by firmware, we should get a
> > > value other than 0, with or without the le32_to_cpu conversion.
> > > 
> > > So in your case I suspect that hardware/firmware has never sent any REO status to
> > > the host.
> > 
> > Yes, I see it the same way.
> > 
> > > > > > Interestingly, DMA read/write accesses work fine for other rings, just not for
> > > > > > this one. What makes the REO status ring so special? I couldn’t find anything in
> > > > > > the initialization routine that would explain the difference.
> > > > > > 
> > > > > > Could anyone give me a hint on what I should be looking for?
> > > > > > 
> > > > > > 
> > > > > What hardware are you using? WCN7850 or QCN9274?
> > > > 
> > > > I'm using QCN9274-based dualmac modules.
> > > 
> > > sure
> > > 
> > > > > 
> > > > Best regards
> > > > Alexander Wilhelm
> > > 
> > > so did you see any obvious issue?
> > 
> > For example, in the function 'ath12k_dp_rx_peer_tid_delete', the function
> > 'ath12k_dp_reo_cmd_send' is called, which in turn registers the function
> > 'ath12k_dp_rx_tid_del_func' as a callback. On PowerPC, this callback function is
> > never invoked, which eventually leads to the following error:
> > 
> >      ath12k_pci 0002:01:00.0: failed to send HAL_REO_CMD_UPDATE_RX_QUEUE cmd, tid 15 (-105)
> >      ath12k_pci 0002:01:00.0: failed to update rx tid queue, tid 0 (-105)
> >      ath12k_pci 0002:01:00.0: failed to update reo for rx tid 0
> >      ath12k_pci 0002:01:00.0: failed to setup rx tid -105
> >      ath12k_pci 0002:01:00.0: pdev idx 0 unable to perform ampdu action 0 ret -105
> > 
> > My investigations have shown that a cache flush is supposed to happen at some
> > point, e.g. after a timeout or when a threshold of 64 is reached. Since this
> > does not happen, I encounter errors after the 127th 'cmd_num'. This callback
> > function should actually be called from the 'reo_cmd_list' within the function
> > 'ath12k_dp_rx_process_reo_status'. However, this does not happen because the
> > pointer is always 0.
> > 
> > I hope I was able to explain clearly what I was able to trace. Please correct me
> > if any of my assumptions are wrong.
> 
> Your understanding is mostly correct. It is also possible that something is
> missing in the REO_CMD ring (setup and cmd_send), which shows up as a symptom
> like this in REO_STATUS ring processing. If other src and dst rings are working
> fine, the REO_CMD/STATUS rings are also expected to work. Please check the src
> and dst ring setup path once again.

Thank you for the hint regarding the REO_CMD ring; it helped me track down the
issue. The problem was with the 'tl' field in the 'hal_tlv_64_hdr' structure: it
should be declared as '__le64' instead of 'u64', similar to the '__le32 tl'
field in 'hal_tlv_hdr'. As a result, some necessary endianness conversions were
missing.
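
To illustrate the kind of conversion that was missing, here is a minimal
userspace sketch, not the actual driver change. Everything prefixed with
'demo_' or 'DEMO_' is hypothetical: 'le64_t' stands in for the kernel's
'__le64', the helpers stand in for 'cpu_to_le64' and 'le64_to_cpu', and the tag
and length masks are made up for the example rather than taken from the HAL
headers.

```c
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the kernel's __le64 annotation (assumption). */
typedef uint64_t le64_t;

/* Hypothetical bit layout, for illustration only; not the real HAL masks. */
#define DEMO_TLV_TAG_SHIFT 1
#define DEMO_TLV_TAG_MASK  0x1ffULL
#define DEMO_TLV_LEN_SHIFT 10
#define DEMO_TLV_LEN_MASK  0xfffULL

/* Simplified header: 'tl' is always stored in little-endian byte order. */
struct demo_tlv_64_hdr {
	le64_t tl;
};

/* Portable stand-in for cpu_to_le64(): emit the bytes LSB first. */
static le64_t demo_cpu_to_le64(uint64_t v)
{
	uint8_t b[8];
	le64_t out;
	int i;

	for (i = 0; i < 8; i++)
		b[i] = (uint8_t)(v >> (8 * i));
	memcpy(&out, b, sizeof(out));
	return out;
}

/* Portable stand-in for le64_to_cpu(): read the bytes LSB first. */
static uint64_t demo_le64_to_cpu(le64_t v)
{
	uint8_t b[8];
	uint64_t out = 0;
	int i;

	memcpy(b, &v, sizeof(b));
	for (i = 0; i < 8; i++)
		out |= (uint64_t)b[i] << (8 * i);
	return out;
}

/* Build the header word in CPU order, then convert once before storing. */
static le64_t demo_tlv_encode(uint64_t tag, uint64_t len)
{
	uint64_t tl = ((tag & DEMO_TLV_TAG_MASK) << DEMO_TLV_TAG_SHIFT) |
		      ((len & DEMO_TLV_LEN_MASK) << DEMO_TLV_LEN_SHIFT);

	return demo_cpu_to_le64(tl);
}

/* Convert back to CPU order before extracting any field. */
static uint64_t demo_tlv_tag(const struct demo_tlv_64_hdr *hdr)
{
	return (demo_le64_to_cpu(hdr->tl) >> DEMO_TLV_TAG_SHIFT) &
	       DEMO_TLV_TAG_MASK;
}

static uint64_t demo_tlv_len(const struct demo_tlv_64_hdr *hdr)
{
	return (demo_le64_to_cpu(hdr->tl) >> DEMO_TLV_LEN_SHIFT) &
	       DEMO_TLV_LEN_MASK;
}
```

With a plain 'u64' field, the shifts and masks would be applied to the value in
host byte order, so a big endian host would write a header the device cannot
parse. Declaring the field as '__le64' in the kernel also lets sparse flag any
access that skips the conversion.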


Best regards
Alexander Wilhelm