Message-ID: <5268c9ba-16cf-4d3a-87df-bbe0ddd3d584@quicinc.com>
Date: Tue, 3 Jun 2025 18:52:37 +0800
From: Baochen Qiang <quic_bqiang@...cinc.com>
To: Johan Hovold <johan@...nel.org>, Miaoqing Pan <quic_miaoqing@...cinc.com>
CC: Johan Hovold <johan+linaro@...nel.org>,
Jeff Johnson <jjohnson@...nel.org>, <linux-wireless@...r.kernel.org>,
<ath11k@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
<stable@...r.kernel.org>
Subject: Re: [PATCH 1/3] wifi: ath11k: fix dest ring-buffer corruption
On 6/2/2025 4:03 PM, Johan Hovold wrote:
> On Thu, May 29, 2025 at 03:03:38PM +0800, Miaoqing Pan wrote:
>> On 5/26/2025 7:48 PM, Johan Hovold wrote:
>>> Add the missing memory barriers to make sure that destination ring
>>> descriptors are read after the head pointers to avoid using stale data
>>> on weakly ordered architectures like aarch64.
>
>>> @@ -3851,6 +3851,9 @@ int ath11k_dp_process_rx_err(struct ath11k_base *ab, struct napi_struct *napi,
>>>
>>> ath11k_hal_srng_access_begin(ab, srng);
>>>
>>> + /* Make sure descriptor is read after the head pointer. */
>>> + dma_rmb();
>>> +
>>
>> Thanks Johan, for continuing to follow up on this issue. I have some
>> different opinions.
>>
>> This change somewhat deviates from the fix approach described in
>> https://lore.kernel.org/all/20250321095219.19369-1-johan+linaro@kernel.org/.
>> In this case, the descriptor might be accessed before it is updated or
>> while it is still being updated. Therefore, a dma_rmb() should be added
>> after the call to ath11k_hal_srng_dst_get_next_entry() and before
>> accessing ath11k_hal_ce_dst_status_get_length(), to ensure that the DMA
>> has completed before reading the descriptor.
>>
>> However, in this patch, the memory barrier is used to protect the head
>> pointer (HP). I don't think a memory barrier is necessary for HP,
>> because even if an outdated HP is fetched,
>> ath11k_hal_srng_dst_get_next_entry() will return NULL and exit safely.
>
> No, the barrier is needed between reading the head pointer and accessing
> descriptor fields, that's what matters.
>
> You can still end up with reading stale descriptor data even when
> ath11k_hal_srng_dst_get_next_entry() returns non-NULL due to speculation
> (that's what happens on the X13s).
But a dma_rmb() does not prevent speculation, no matter where it is placed,
right? If so, the whole point of dma_rmb() is to prevent compiler or CPU
reordering, but is such reordering really possible here?
The sequence is:

1# reading HP
	srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);

2# validate HP
	if (srng->u.dst_ring.tp == srng->u.dst_ring.cached_hp)
		return NULL;

3# get desc
	desc = srng->ring_base_vaddr + srng->u.dst_ring.tp;

4# accessing desc
	ath11k_hal_desc_reo_parse_err(... desc, ...)

Clearly each step depends on the result of the previous one. In that case the
compiler/CPU is expected not to do any reordering, isn't it?
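
To make the four steps easier to discuss, here is a minimal user-space model
of them. It is only a sketch and not the driver code: every demo_* name is
hypothetical, the ring holds plain u32 values instead of real descriptors, and
a C11 acquire fence stands in for dma_rmb() at the position used by the patch
under discussion (right after reading HP). C11 atomics do not model device
DMA, so this shows where the barrier sits in the sequence, not what the
hardware does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_ENTRIES 8u

/* Hypothetical stand-in for the destination ring state. */
struct demo_dst_ring {
	_Atomic uint32_t hp;          /* head pointer, advanced by the producer */
	uint32_t cached_hp;           /* consumer's snapshot of hp */
	uint32_t tp;                  /* tail pointer, owned by the consumer */
	uint32_t desc[RING_ENTRIES];  /* descriptors, filled by the producer */
};

/* Producer side: fill the descriptor first, then publish it by moving hp. */
static void demo_produce(struct demo_dst_ring *ring, uint32_t value)
{
	uint32_t hp = atomic_load_explicit(&ring->hp, memory_order_relaxed);

	ring->desc[hp] = value;
	atomic_store_explicit(&ring->hp, (hp + 1) % RING_ENTRIES,
			      memory_order_release);
}

/* Step 1: read HP once; the fence models the barrier placed after it. */
static void demo_access_begin(struct demo_dst_ring *ring)
{
	ring->cached_hp = atomic_load_explicit(&ring->hp, memory_order_relaxed);
	/* Assumption: acquire fence as a user-space stand-in for dma_rmb(). */
	atomic_thread_fence(memory_order_acquire);
}

/* Steps 2 and 3: validate HP, then compute the descriptor address from tp. */
static uint32_t *demo_get_next_entry(struct demo_dst_ring *ring)
{
	uint32_t *desc;

	assert(ring->tp < RING_ENTRIES);
	if (ring->tp == ring->cached_hp)
		return NULL;

	desc = &ring->desc[ring->tp];
	ring->tp = (ring->tp + 1) % RING_ENTRIES;
	return desc;
}

/* Step 4: access every available descriptor; returns the sum of the values. */
static uint32_t demo_drain(struct demo_dst_ring *ring)
{
	uint32_t sum = 0;
	uint32_t *desc;

	demo_access_begin(ring);
	while ((desc = demo_get_next_entry(ring)) != NULL)
		sum += *desc;

	return sum;
}
```

In this model the descriptor address in step 3 is computed from tp, while
cached_hp is only used in the comparison in step 2.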
>
> Whether to place it before or after (or inside)
> ath11k_hal_srng_dst_get_next_entry() is a trade off between readability,
> maintainability and whether we want to avoid unnecessary barriers in
> cases like the above where we strictly only need one barrier before the
> loop (or if we want to avoid the barrier in case the ring is ever
> empty).
>
>> So, placing the memory barrier inside
>> ath11k_hal_srng_dst_get_next_entry() would be more appropriate.
>>
>> @@ -678,6 +678,8 @@ u32 *ath11k_hal_srng_dst_get_next_entry(struct
>> ath11k_base *ab,
>> if (srng->flags & HAL_SRNG_FLAGS_CACHED)
>> ath11k_hal_srng_prefetch_desc(ab, srng);
>>
>> + dma_rmb();
>> +
>> return desc;
>> }
>
> So this will add a barrier in each iteration of the loop, but we only
> need a single one after reading the head pointer.
>
> [ Also note that ath11k_hal_srng_dst_peek() would similarly need a
> barrier if we were to move them into those helpers. ]
>
> Johan
>