Message-ID: <4d87ef88-3533-4255-adc6-6c268818fe25@collabora.com>
Date: Fri, 25 Apr 2025 12:42:38 +0500
From: Muhammad Usama Anjum <usama.anjum@...labora.com>
To: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
Cc: Johannes Berg <johannes@...solutions.net>,
Jeff Johnson <jjohnson@...nel.org>, Jeffrey Hugo <quic_jhugo@...cinc.com>,
Yan Zhen <yanzhen@...o.com>, Youssef Samir <quic_yabdulra@...cinc.com>,
Qiang Yu <quic_qianyu@...cinc.com>, Alex Elder <elder@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Kunwu Chan <chentao@...inos.cn>, kernel@...labora.com, mhi@...ts.linux.dev,
linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-wireless@...r.kernel.org, ath11k@...ts.infradead.org
Subject: Re: [PATCH v2] bus: mhi: host: don't free bhie tables during
suspend/hibernation
On 4/25/25 12:32 PM, Manivannan Sadhasivam wrote:
> On Fri, Apr 25, 2025 at 12:14:39PM +0500, Muhammad Usama Anjum wrote:
>> On 4/25/25 12:04 PM, Manivannan Sadhasivam wrote:
>>> On Thu, Apr 10, 2025 at 07:56:54PM +0500, Muhammad Usama Anjum wrote:
>>>> Fix dma_direct_alloc() failure at resume time during bhie_table
>>>> allocation. There is a crash report where, at resume time, the DMA
>>>> memory allocation fails and MHI fails to re-initialize.
>>>> There may be fragmentation of some kind which fails the allocation
>>>> call.
>>>>
>>>
>>> If dma_direct_alloc() fails, then it is a platform limitation/issue. We cannot
>>> work around that in the device drivers. What is the guarantee that other drivers
>>> will also continue to work? Will you go ahead and patch all of them which
>>> release memory during suspend?
>>>
>>> Please investigate why the allocation fails. This is not even a device issue, so
>>> we cannot add quirks :/
>> This isn't a platform-specific quirk. We are only hitting it because
>> there is high memory pressure during suspend/resume. This DMA allocation
>> failure can happen under memory pressure on any device.
>>
>
> Yes.
Thanks for understanding.
>
>> The purpose of this patch is just to make the driver more robust to memory
>> pressure during resume.
>>
>> I'm not sure about MHI, but other drivers already have such patches, as
>> dma_direct_alloc() is susceptible to failures when memory pressure is
>> high. This patch was motivated by ath12k [1] and ath11k [2].
>>
>
> Even if we patch the MHI driver, the issue is going to trip some other driver.
> How does the DMA memory go low during resume? Is some other driver
> consuming more than it did during probe()?
Think of it like this. The first probe happens just after boot, when most
of the RAM is free. Then let's say the user launches applications which
consume not only the entire RAM but also the swap. The DMA memory area is
the first ~4GB on x86_64 (if I'm not mistaken). Now at resume time, when we
want to allocate memory from DMA, it may not be entirely available, or
because of fragmentation we cannot allocate that much contiguous memory.

In our testing and in real-world cases, right now only the wifi driver is
misbehaving. Wifi is also very important, so we are hoping to make the wifi
driver robust.
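
To illustrate the fragmentation argument above, here is a toy user-space
model. It is only a sketch, not kernel code: toy_zone, toy_alloc_contig()
and the other names are made up for illustration, the 16-page zone is an
arbitrary assumption, and the real page allocator is far more involved. It
only shows that a contiguous request can fail while plenty of pages are
free, and that a buffer kept across suspend is unaffected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16 /* hypothetical size of the toy DMA zone, in pages */

/* Toy zone: used[i] is true when page i is allocated. */
struct toy_zone {
	bool used[NPAGES];
};

/* Count the pages that are currently free. */
size_t toy_free_pages(const struct toy_zone *z)
{
	size_t n = 0;

	for (size_t i = 0; i < NPAGES; i++)
		if (!z->used[i])
			n++;
	return n;
}

/*
 * First-fit allocation of len physically contiguous pages.
 * Returns the index of the first page, or -1 if no run is long enough.
 */
int toy_alloc_contig(struct toy_zone *z, size_t len)
{
	size_t run = 0;

	assert(len > 0 && len <= NPAGES);

	for (size_t i = 0; i < NPAGES; i++) {
		run = z->used[i] ? 0 : run + 1;
		if (run == len) {
			size_t start = i + 1 - len;

			for (size_t j = start; j <= i; j++)
				z->used[j] = true;
			return (int)start;
		}
	}
	return -1;
}

/* Release len pages starting at start. */
void toy_free(struct toy_zone *z, int start, size_t len)
{
	for (size_t j = 0; j < len; j++)
		z->used[(size_t)start + j] = false;
}

/* Simulate memory pressure: other users grab every even page still free. */
void toy_fragment(struct toy_zone *z)
{
	for (size_t i = 0; i < NPAGES; i += 2)
		z->used[i] = true;
}
```

In this model, allocating 4 pages on an empty zone (the "probe" case)
succeeds. If the buffer is freed at suspend and toy_fragment() runs before
resume, half of the pages are still free but no 4-page run exists, so the
same request fails. If the buffer is kept instead, nothing needs to be
allocated at resume.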
>
>> [1]
>> https://lore.kernel.org/all/20240419034034.2842-1-quic_bqiang@quicinc.com/
>> [2]
>> https://lore.kernel.org/all/20220506141448.10340-1-quic_akolli@quicinc.com/
>>
>> What do you think can be the way forward for this patch?
>>
>
> Let's try first to analyze why the memory pressure happens during suspend. As I
> can see, even if we fix the MHI driver, you are likely to hit this issue
> somewhere else.
>
> - Mani
>
>>>
>
> [...]
>
>>> Did you intend to leak this information? If not, please remove it from
>>> stacktrace.
>> The device isn't private. It's fine.
>>
>
> Okay.
>
> - Mani
>
--
Regards,
Usama