[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <38fdb314-da3c-400f-98a4-88937717619f@quicinc.com>
Date: Wed, 23 Apr 2025 14:47:55 +0800
From: Baochen Qiang <quic_bqiang@...cinc.com>
To: Muhammad Usama Anjum <usama.anjum@...labora.com>,
Jeff Johnson
<jjohnson@...nel.org>, Kalle Valo <kvalo@...nel.org>,
Anilkumar Kolli
<quic_akolli@...cinc.com>
CC: <kernel@...labora.com>, <linux-wireless@...r.kernel.org>,
<ath11k@...ts.infradead.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] wifi: ath11k: Fix memory reuse logic
On 4/23/2025 2:26 PM, Muhammad Usama Anjum wrote:
> On 4/22/25 1:01 PM, Baochen Qiang wrote:
>>
>>
>> On 4/22/2025 3:46 PM, Muhammad Usama Anjum wrote:
>>> Hi,
>>>
>>> Thank you for excellent review.
>>>
>>> On 4/22/25 7:15 AM, Baochen Qiang wrote:
>>>>
>>>>
>>>> On 4/18/2025 8:09 PM, Muhammad Usama Anjum wrote:
>>>>> Firmware requests 2 segments at first. 1st segment is of 6799360 whose
>>>>> allocation fails and we return success to firmware. Then firmware asks
>>>>
>>>> Host won't fail in case DMA remapping is enabled. Better to rephrase to make it clear that
>>>> the big segment allocation fails in case DMA remapping is not working, usually due to
>>>> IOMMU not present or any necessary kernel config not enabled.
>>> IOMMU is turned off. I'll make description better.
>>>
>>>>
>>>>> for 22 smaller segments. Those get allocated. At suspend/hibernation
>>>>> time, these segments aren't freed as they are reused by firmware.
>>>>>
>>>>> After resume the firmware asks for 2 segments again with first segment
>>>>> of 6799360 and with same vaddr of the first smaller segment which we had
>>>>
>>>> Not follow you here. What do you mean by 'same vaddr'? firmware does not care about vaddr
>>>> at all.
>>> So we get request to allocate memory of size = 6799360 and vaddr =
>>> 0xABC). We fail it. Then we get request to allocate memory of size =
>>> 500000 and vaddr is same 0xABC which gets allocated successfully.
>>>
>>> When we resume, firmware asks again for 6799360 with 0xABC vaddr even
>>> though we had allocated memory of 500000 size at 0xABC. I'm referring to
>>> this vaddr that its same.
>>
>> OK, get your point. But like I said, firmware doesn't case about vaddr, so it is not
>> asking for a 'same vaddr'.
>>
>> IMO just mentioning vaddr is not NULL is sufficient.
> Okay. I'll update the description to avoid confusion.
>
>>
>>>
>>>>
>>>>> allocated. Hence vaddr isn't NULL and we compare the type and size if it
>>>>> can be reused. Unfornately, we detect that we cannot reuse it and this
>>>>
>>>> s/Unfornately/Unfortunately/
>>>>
>>>>> first smaller segment is freed. Then we continue to allocate 6799360 size
>>>>> memory from dma which fails and we call ath11k_qmi_free_target_mem_chunk()
>>>>> which frees the second smaller segment as well. Later success is returned
>>>>> to firmware which asks for 22 smaller segments again. But as we had freed
>>>>> 2 segments already, we'll allocate the first 2 new segments again and
>>>>> reuse the remaining 20.
>>>>>
>>>>> This patch is correctiong the skip logic when vaddr is set, but size/type
>>>>
>>>> s/correctiong/correcting/
>>>>
>>>>> don't match. In this case, we should use the same skip and success logic
>>>>> as used when dma_alloc_coherent fails without freeing the memory area.
>>>>>
>>>>> We had got reports that memory allocation in this function failed at
>>>>
>>>> any public link to the report?
>>> There's no public report. I've attached the logs. You'll find following
>>> error logs in it:
>>>
>>> ath11k_pci 0000:03:00.0: failed to allocate dma memory for qmi (524288 B
>>> type 1)
>>> ath11k_pci 0000:03:00.0: failed to allocate qmi target memory: -22
>>>
>>>
>>>>
>>>>> resume which made us debug why the reuse logic is wrong. Those failures
>>>>> weren't because of the bigger chunk allocation failure as they are
>>>>> skipped. Rather these failures were because of smaller chunk allocation
>>>>> failures. This patch fixes freeing and allocation of 2 smaller chunks.
>>>>
>>>> any you saying kernels fail to alloc a smaller chunk? why? is system memory exhausted or
>>>> too fragmented?
>>> Yes, the smaller chunk doesn't get allocated. I've not been able to
>>> reproduce it on my setup. Both system memory exhaustion and
>>> fragmentation are the suspects.
>>
>> so it is kernel failing to allocate the buffer, not any issue in ath12k leading to this.
>> Please help make this clear to avoid confusion.
> We caught the bug as kernel was unable to allocate memory at resume.
> Later found out that with the optimization in ath11k, we shouldn't be
> trying to allocate memory in the first place. That's why I've sent this
> patch.
>
yeah, understand that.
> Let me update the description and send v2.
>
looking forward to it.
>
Powered by blists - more mailing lists