[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <90077fb2-42e0-42ab-b5a1-b4b73ff201d5@quicinc.com>
Date: Thu, 1 Aug 2024 11:04:18 -0700
From: Jeff Johnson <quic_jjohnson@...cinc.com>
To: Kalle Valo <kvalo@...nel.org>, Baochen Qiang <quic_bqiang@...cinc.com>
CC: <ath12k@...ts.infradead.org>, <linux-wireless@...r.kernel.org>,
<kernel@...cinc.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH] wifi: ath12k: use 128 bytes aligned iova in transmit path
for WCN7850
On 8/1/2024 8:07 AM, Kalle Valo wrote:
> Baochen Qiang <quic_bqiang@...cinc.com> writes:
>
>> In transmit path, it is likely that the iova is not aligned to PCIe TLP
>> max payload size, which is 128 for WCN7850. Normally in such cases hardware
>> is expected to split the packet into several parts in a manner such that
>> they, other than the first one, have aligned iova. However due to hardware
>> limitations, WCN7850 does not behave like that properly with some specific
>> unaligned iova in transmit path. This easily results in target hang in a
>> KPI transmit test: packet send/receive failure, WMI command send timeout
>> etc. Also fatal error seen in PCIe level:
>>
>> ...
>> Capabilities: ...
>> ...
>> DevSta: ... FatalErr+ ...
>> ...
>> ...
>>
>> Work around this by manually moving/reallocating payload buffer such that
>> we can map it to a 128 bytes aligned iova. The moving requires sufficient
>> head room or tail room in skb: for the former we can do ourselves a favor
>> by asking some extra bytes when registering with mac80211, while for the
>> latter we can do nothing.
>>
>> Moving/reallocating buffer consumes additional CPU cycles, but the good news
>> is that an aligned iova increases PCIe efficiency. In my tests on some X86
>> platforms the KPI results are almost consistent.
>>
>> Since this is seen only with WCN7850, add a new hardware parameter to
>> differentiate from others.
>>
>> Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
>>
>> Signed-off-by: Baochen Qiang <quic_bqiang@...cinc.com>
>
> [...]
>
>> --- a/drivers/net/wireless/ath/ath12k/dp_tx.c
>> +++ b/drivers/net/wireless/ath/ath12k/dp_tx.c
>> @@ -162,6 +162,60 @@ static int ath12k_dp_prepare_htt_metadata(struct sk_buff *skb)
>> return 0;
>> }
>>
>> +static void ath12k_dp_tx_move_payload(struct sk_buff *skb,
>> + unsigned long delta,
>> + bool head)
>> +{
>> + unsigned long len = skb->len;
>> +
>> + if (head) {
>> + skb_push(skb, delta);
>> + memmove(skb->data, skb->data + delta, len);
>> + skb_trim(skb, len);>> + } else {
>> + skb_put(skb, delta);
>> + memmove(skb->data + delta, skb->data, len);
>> + skb_pull(skb, delta);
>> + }
>> +}
>
> I'm nitpicking, but usually booleans like the head variable here don't
> help with readability. Having two separate functions would be easier to
> read, but this is fine as it's so small.
>
>> @@ -279,6 +334,23 @@ int ath12k_dp_tx(struct ath12k *ar, struct ath12k_vif *arvif,
>> goto fail_remove_tx_buf;
>> }
>>
>> + if (iova_mask &&
>> + (unsigned long)skb->data & iova_mask) {
>> + ret = ath12k_dp_tx_align_payload(ab, &skb);
>> + if (ret) {
>> + dev_warn_once(ab->dev, "failed to align TX buffer %d\n", ret);
>
> Why dev_warn_once()? I changed it to ath12k_warn() in the pending
> branch.
My concern was that if this is an ongoing issue that you'd end up spamming the
kernel log. But I guess the rate limiting will reduce the spam to no more than
10 logs in a 5 second interval
>
>> --- a/drivers/net/wireless/ath/ath12k/hw.h
>> +++ b/drivers/net/wireless/ath/ath12k/hw.h
>> @@ -96,6 +96,8 @@
>> #define ATH12K_M3_FILE "m3.bin"
>> #define ATH12K_REGDB_FILE_NAME "regdb.bin"
>>
>> +#define PCIE_MAX_PAYLOAD_SIZE 128
>
> PCIE prefix implies that this is in PCI subsystem. I renamed it to
> ATH12K_PCIE_MAX_PAYLOAD_SIZE.
>
> Please check my changes:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/ath/ath.git/commit/?h=pending&id=b603c1e0d94fb1eb0576ef48ebe37c8c1ce86328
Acked-by: Jeff Johnson <quic_jjohnson@...cinc.com>
Powered by blists - more mailing lists