Message-ID: <79b4bac1-6e55-408c-a334-006eded4229f@quicinc.com>
Date: Wed, 4 Jun 2025 15:57:57 +0800
From: Miaoqing Pan <quic_miaoqing@...cinc.com>
To: Johan Hovold <johan@...nel.org>
CC: Baochen Qiang <quic_bqiang@...cinc.com>,
Johan Hovold <johan+linaro@...nel.org>,
Jeff Johnson <jjohnson@...nel.org>, <linux-wireless@...r.kernel.org>,
<ath11k@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
<stable@...r.kernel.org>
Subject: Re: [PATCH 1/3] wifi: ath11k: fix dest ring-buffer corruption
On 6/4/2025 3:06 PM, Johan Hovold wrote:
> On Wed, Jun 04, 2025 at 01:32:08PM +0800, Miaoqing Pan wrote:
>> On 6/4/2025 10:34 AM, Miaoqing Pan wrote:
>>> On 6/3/2025 7:51 PM, Johan Hovold wrote:
>>>> On Tue, Jun 03, 2025 at 06:52:37PM +0800, Baochen Qiang wrote:
>>>>> On 6/2/2025 4:03 PM, Johan Hovold wrote:
>>>>
>>>>>> No, the barrier is needed between reading the head pointer and
>>>>>> accessing descriptor fields, that's what matters.
>>>>>>
>>>>>> You can still end up with reading stale descriptor data even when
>>>>>> ath11k_hal_srng_dst_get_next_entry() returns non-NULL due to
>>>>>> speculation (that's what happens on the X13s).
>>>>>
>>>>> The fact is that a dma_rmb() does not even prevent speculation, no
>>>>> matter where it is placed, right?
>>>>
>>>> It prevents the speculated load from being used.
>>>>
>>>>> If so the whole point of dma_rmb() is to prevent from compiler
>>>>> reordering or CPU reordering, but is it really possible?
>>>>>
>>>>> The sequence is
>>>>>
>>>>> 1# reading HP
>>>>> srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
>>>>>
>>>>> 2# validate HP
>>>>> if (srng->u.dst_ring.tp == srng->u.dst_ring.cached_hp)
>>>>> return NULL;
>>>>>
>>>>> 3# get desc
>>>>> desc = srng->ring_base_vaddr + srng->u.dst_ring.tp;
>>>>>
>>>>> 4# accessing desc
>>>>> ath11k_hal_desc_reo_parse_err(... desc, ...)
>>>>>
>>>>> Clearly each step depends on the results of previous steps. In this
>>>>> case the compiler/CPU is expected to be smart enough to not do any
>>>>> reordering, isn't it?
>>>>
>>>> Steps 3 and 4 can be done speculatively before the load in step 1 is
>>>> complete as long as the result is discarded if it turns out not to be
>>>> needed.
>
>>> If the condition in step 2 is true and step 3 speculatively loads
>>> descriptor from TP before step 1, could this cause issues?
>>
>> Sorry for typo, if the condition in step 2 is false and step 3
>> speculatively loads descriptor from TP before step 1, could this cause
>> issues?
>
> Almost correct; the descriptor can be loaded (from TP) before the head
> pointer is loaded and thus before the condition in step 2 has been
> evaluated. And if the condition in step 2 later turns out to be false,
> step 4 may use stale data from before the head pointer was updated.
>
Actually, there's a missing step between step 3 and step 4, advancing TP
(TP+1):

srng->u.dst_ring.tp += srng->entry_size;

TP is managed by the CPU and points to the first unprocessed descriptor,
while HP and the descriptor contents are updated asynchronously by DMA.
So are you saying that the speculatively loaded descriptor has either
not been updated yet or is still in the process of being updated?
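
For reference, the sequence under discussion (steps 1 to 3 plus the TP
advance) can be sketched in user-space C roughly as follows. This is only
an illustration under stated assumptions, not the ath11k code: the demo_*
names, the ring geometry and the wrap-around handling are hypothetical, the
producer function merely stands in for the device, and a C11 acquire fence
stands in for dma_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_ENTRIES 4u	/* hypothetical ring size, in descriptors */
#define ENTRY_SIZE   2u	/* hypothetical descriptor size, in 32-bit words */
#define RING_WORDS   (RING_ENTRIES * ENTRY_SIZE)

struct demo_dst_ring {
	uint32_t ring[RING_WORDS];	/* descriptor memory, written by the "device" */
	_Atomic uint32_t hp;		/* head pointer, written by the "device" */
	uint32_t cached_hp;		/* consumer's snapshot of hp */
	uint32_t tp;			/* tail pointer, owned by the consumer */
};

/* Stand-in for the device: fill one descriptor, then publish the new HP. */
static void demo_produce(struct demo_dst_ring *r, uint32_t w0, uint32_t w1)
{
	uint32_t hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	r->ring[hp] = w0;
	r->ring[hp + 1] = w1;
	atomic_store_explicit(&r->hp, (hp + ENTRY_SIZE) % RING_WORDS,
			      memory_order_release);
}

/* Consumer: returns the next unprocessed descriptor, or NULL if none. */
static uint32_t *demo_get_next_entry(struct demo_dst_ring *r)
{
	uint32_t *desc;

	assert(r->tp < RING_WORDS);

	/* 1# read HP (plain load, like READ_ONCE()) */
	r->cached_hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	/* 2# validate HP */
	if (r->tp == r->cached_hp)
		return NULL;

	/*
	 * Stand-in for dma_rmb(): descriptor loads issued after this point
	 * may not be satisfied with data read before the HP load above.
	 */
	atomic_thread_fence(memory_order_acquire);

	/* 3# get desc */
	desc = &r->ring[r->tp];

	/* TP+1: advance the tail pointer, wrapping at the end of the ring */
	r->tp = (r->tp + ENTRY_SIZE) % RING_WORDS;

	/* 4# the caller accesses the descriptor fields through desc */
	return desc;
}
```

Without the fence, nothing in this sketch orders the descriptor loads in
step 4 after the HP load in step 1, which is the window being discussed.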