Message-ID: <63c2c889-0a09-4b25-a1d2-2aaa92d75d9f@oss.qualcomm.com>
Date: Mon, 16 Jun 2025 14:59:24 +0530
From: Praneesh P <praneesh.p@....qualcomm.com>
To: Baochen Qiang <quic_bqiang@...cinc.com>, Johan Hovold <johan@...nel.org>,
P Praneesh <quic_ppranees@...cinc.com>
Cc: Johan Hovold <johan+linaro@...nel.org>,
Jeff Johnson <jjohnson@...nel.org>,
Miaoqing Pan <quic_miaoqing@...cinc.com>,
Remi Pommarel <repk@...plefau.lt>, linux-wireless@...r.kernel.org,
ath12k@...ts.infradead.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH v2 1/4] wifi: ath12k: fix dest ring-buffer corruption
On 6/5/2025 4:19 PM, Baochen Qiang wrote:
>
> On 6/5/2025 6:00 PM, Johan Hovold wrote:
>> On Thu, Jun 05, 2025 at 04:41:32PM +0800, Baochen Qiang wrote:
>>> On 6/4/2025 10:45 PM, Johan Hovold wrote:
>>>> Add the missing memory barrier to make sure that destination ring
>>>> descriptors are read after the head pointers to avoid using stale data
>>>> on weakly ordered architectures like aarch64.
>>>>
>>>> The barrier is added to the ath12k_hal_srng_access_begin() helper for
>>>> symmetry with follow-on fixes for source ring buffer corruption which
>>>> will add barriers to ath12k_hal_srng_access_end().
>>>>
>>>> Note that this may fix the empty descriptor issue recently worked around
>>>> by commit 51ad34a47e9f ("wifi: ath12k: Add drop descriptor handling for
>>>> monitor ring").
>>> why? I would expect drunk cookies are valid in case of HAL_MON_DEST_INFO0_EMPTY_DESC,
>>> rather than anything caused by reordering.
>> Based on a quick look it seemed like this could possibly fall in the
>> same category as some of the other workarounds I've spotted while
>> looking into these ordering issues (e.g. f9fff67d2d7c ("wifi: ath11k:
>> Fix SKB corruption in REO destination ring")).
>>
>> If you say this one is clearly unrelated, I'll drop the comment.
> Praneesh, could you comment here since you made that change?
The empty/drop descriptor is intentionally issued by the hardware during
backpressure scenarios and is unrelated to the issue discussed in this
patch series.
>>>> @@ -343,9 +343,6 @@ static int ath12k_ce_completed_recv_next(struct ath12k_ce_pipe *pipe,
>>>> goto err;
>>>> }
>>>>
>>>> - /* Make sure descriptor is read after the head pointer. */
>>>> - dma_rmb();
>>>> -
>>>> *nbytes = ath12k_hal_ce_dst_status_get_length(desc);
>>>>
>>>> *skb = pipe->dest_ring->skb[sw_index];
>>>> diff --git a/drivers/net/wireless/ath/ath12k/hal.c b/drivers/net/wireless/ath/ath12k/hal.c
>>>> index 91d5126ca149..9eea13ed5565 100644
>>>> --- a/drivers/net/wireless/ath/ath12k/hal.c
>>>> +++ b/drivers/net/wireless/ath/ath12k/hal.c
>>>> @@ -2126,13 +2126,24 @@ void *ath12k_hal_srng_src_get_next_reaped(struct ath12k_base *ab,
>>>>
>>>> void ath12k_hal_srng_access_begin(struct ath12k_base *ab, struct hal_srng *srng)
>>>> {
>>>> + u32 hp;
>>>> +
>>>> lockdep_assert_held(&srng->lock);
>>>>
>>>> - if (srng->ring_dir == HAL_SRNG_DIR_SRC)
>>>> + if (srng->ring_dir == HAL_SRNG_DIR_SRC) {
>>>> srng->u.src_ring.cached_tp =
>>>> *(volatile u32 *)srng->u.src_ring.tp_addr;
>>>> - else
>>>> - srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
>>>> + } else {
>>>> + hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
>>>> +
>>>> + if (hp != srng->u.dst_ring.cached_hp) {
>>> This consumes additional CPU cycles in hot path, which is a concern to me.
>>>
>>> Based on that, I prefer the v1 implementation.
>> The conditional avoids a memory barrier in case the ring is empty, so
>> for all callers but ath12k_ce_completed_recv_next() it's an improvement
>> over v1 in that sense.
>>
>> I could make the barrier unconditional, which will only add one barrier
>> to ath12k_ce_completed_recv_next() in case the ring is empty compared to
>> v1. Perhaps that's a good compromise if you worry about the extra
>> comparison?
> I guess the unconditional barrier also has impact on performance? If so I am not sure
> which one is better then ...
>
> Let's just keep it as is and see what others think.
>
>> I very much want to avoid having both explicit barriers in the caller
>> and barriers in the hal end() helper. I think it should be either or.
>>
>>>> + srng->u.dst_ring.cached_hp = hp;
>>>> + /* Make sure descriptor is read after the head
>>>> + * pointer.
>>>> + */
>>>> + dma_rmb();
>>>> + }
>>>> + }
>> Johan
>
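
For reference, the ordering discussed in the quoted hunk can be sketched in
userspace C. This is only an illustration under stated assumptions, not the
driver code: every demo_* name is hypothetical, the ring is a plain array,
and a C11 acquire fence stands in for the kernel's dma_rmb(). It shows the
v2 idea of issuing the read barrier only when the head pointer has moved,
so that descriptors are read after the head pointer.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RING_ENTRIES 8u

/* Hypothetical, simplified destination ring. The producer (the device in
 * the real driver) writes a descriptor and then advances the head pointer.
 */
struct demo_dst_ring {
	_Atomic uint32_t hp;	/* head pointer, written by the producer */
	uint32_t cached_hp;	/* consumer's snapshot of the head pointer */
	uint32_t tp;		/* tail pointer, owned by the consumer */
	uint32_t desc[DEMO_RING_ENTRIES];
};

static void demo_init(struct demo_dst_ring *ring)
{
	atomic_init(&ring->hp, 0);
	ring->cached_hp = 0;
	ring->tp = 0;
}

/* Assumption: userspace stand-in for dma_rmb(). It orders the earlier
 * head-pointer load before any later descriptor loads.
 */
static inline void demo_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

/* Snapshot the head pointer; pay for the barrier only if new entries
 * were published since the last snapshot.
 */
static void demo_access_begin(struct demo_dst_ring *ring)
{
	uint32_t hp = atomic_load_explicit(&ring->hp, memory_order_relaxed);

	if (hp != ring->cached_hp) {
		ring->cached_hp = hp;
		/* Make sure descriptor is read after the head pointer. */
		demo_rmb();
	}
}

/* Return the next descriptor up to the cached head pointer, if any. */
static bool demo_get_next(struct demo_dst_ring *ring, uint32_t *out)
{
	if (ring->tp == ring->cached_hp)
		return false;

	*out = ring->desc[ring->tp];
	ring->tp = (ring->tp + 1) % DEMO_RING_ENTRIES;
	return true;
}

/* Demo-only producer: write the descriptor, then publish the new head. */
static void demo_produce(struct demo_dst_ring *ring, uint32_t value)
{
	uint32_t hp = atomic_load_explicit(&ring->hp, memory_order_relaxed);

	ring->desc[hp] = value;
	atomic_store_explicit(&ring->hp, (hp + 1) % DEMO_RING_ENTRIES,
			      memory_order_release);
}
```

With this shape an empty ring costs one load and one comparison, while a
non-empty ring additionally pays for the barrier once per
demo_access_begin() call rather than once per descriptor.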