[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7cf2f4e1-8a85-4f07-a886-b47159c92ccc@collabora.com>
Date: Wed, 23 Apr 2025 11:41:32 +0500
From: Muhammad Usama Anjum <usama.anjum@...labora.com>
To: Jeff Hugo <jeff.hugo@....qualcomm.com>,
Krishna Chaitanya Chundru <quic_krichai@...cinc.com>,
Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
Johannes Berg <johannes@...solutions.net>, Jeff Johnson
<jjohnson@...nel.org>, Jeffrey Hugo <quic_jhugo@...cinc.com>,
Yan Zhen <yanzhen@...o.com>, Youssef Samir <quic_yabdulra@...cinc.com>,
Qiang Yu <quic_qianyu@...cinc.com>, Alex Elder <elder@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Kunwu Chan <chentao@...inos.cn>, quic_kvalo@...cinc.com
Cc: kernel@...labora.com, mhi@...ts.linux.dev, linux-arm-msm@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-wireless@...r.kernel.org,
ath11k@...ts.infradead.org
Subject: Re: [PATCH v2] bus: mhi: host: don't free bhie tables during
suspend/hibernation
On 4/22/25 7:22 PM, Jeff Hugo wrote:
> On 4/22/2025 1:23 AM, Muhammad Usama Anjum wrote:
>> On 4/18/25 7:08 PM, Jeff Hugo wrote:
>>> On 4/18/2025 2:10 AM, Muhammad Usama Anjum wrote:
>>>> On 4/14/25 7:14 PM, Jeff Hugo wrote:
>>>>> On 4/14/2025 1:32 AM, Muhammad Usama Anjum wrote:
>>>>>> On 4/12/25 6:22 AM, Krishna Chaitanya Chundru wrote:
>>>>>>>
>>>>>>> On 4/12/2025 12:02 AM, Muhammad Usama Anjum wrote:
>>>>>>>> On 4/11/25 1:39 PM, Krishna Chaitanya Chundru wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 4/11/2025 12:32 PM, Muhammad Usama Anjum wrote:
>>>>>>>>>> On 4/11/25 8:37 AM, Krishna Chaitanya Chundru wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 4/10/2025 8:26 PM, Muhammad Usama Anjum wrote:
>>>>>>>>>>>> Fix dma_direct_alloc() failure at resume time during bhie_table
>>>>>>>>>>>> allocation. There is a crash report where at resume time, the
>>>>>>>>>>>> memory
>>>>>>>>>>>> from the dma doesn't get allocated and MHI fails to re-
>>>>>>>>>>>> initialize.
>>>>>>>>>>>> There may be fragmentation of some kind which fails the
>>>>>>>>>>>> allocation
>>>>>>>>>>>> call.
>>>>>>>>>>>>
>>>>>>>>>>>> To fix it, don't free the memory at power down during suspend /
>>>>>>>>>>>> hibernation. Instead, use the same allocated memory again after
>>>>>>>>>>>> every
>>>>>>>>>>>> resume / hibernation. This patch has been tested with resume
>>>>>>>>>>>> and
>>>>>>>>>>>> hibernation both.
>>>>>>>>>>>>
>>>>>>>>>>>> The rddm is of constant size for a given hardware. While the
>>>>>>>>>>>> fbc_image
>>>>>>>>>>>> size depends on the firmware. If the firmware changes, we'll
>>>>>>>>>>>> free and
>>>>>>>>>>> If firmware image will change between suspend and resume ?
>>>>>>>>>> Yes, correct.
>>>>>>>>>>
>>>>>>>>> why the firmware image size will change between suspend & resume?
>>>>>>>>> who will update the firmware image after bootup?
>>>>>>>>> It is not expected behaviour.
>>>>>>>> I was trying to research if the firmware can change or not. I've
>>>>>>>> not
>>>>>>>> found any documentation on it.
>>>>>>>>
>>>>>>>> If the firmare is updated in filesystem before suspend/hibernate,
>>>>>>>> would
>>>>>>>> the new firwmare be loaded the next time kernel resumes as the
>>>>>>>> older
>>>>>>>> firmware is no where to be found?
>>>>>>>>
>>>>>>>> What do you think about this?
>>>>>>>>
>>>>>>> I don't think firmware can be updated before suspend/hibernate. I
>>>>>>> don't
>>>>>>> see any reason why it can be updated. If you think it can be updated
>>>>>>> please quote relevant doc.
>>>>>> I've not found any documentation on it. Let's wait for others to
>>>>>> review
>>>>>> and it it cannot be updated, I'll remove this part.
>>>>>>
>>>>>
>>>>> Wouldn't this be trivial to test? Boot the device, go modify the
>>>>> firmware on the filesystem, then go through a suspend cycle.
>>>> I just tested this. I've used an old firmware from last year vs the
>>>> latest one.
>>>>
>>>> Firmware A: old firmware size: 5349376
>>>> Firmware B: new firmware size: 5165056
>>>>
>>>> A here has bigger size.
>>>>
>>>> 1. I loaded A at boot and then replaced the firmwares in filesystem
>>>> with
>>>> B before syspend. At resume time, B was loaded fine by freeing the
>>>> bigger memory area and allocating the smaller one.
>>>>
>>>> 2. I loaded B and then replaced A in its place before suspend. At
>>>> resume
>>>> time, memory was freed and larger memory was allocated. But driver
>>>> wasn't able to initialize correctly:
>>>>
>>>> [ 184.051902] ath11k_pci 0000:03:00.0: timeout while waiting for
>>>> restart complete
>>>> [ 184.051916] ath11k_pci 0000:03:00.0: failed to resume core: -110
>>>> [ 184.051923] ath11k_pci 0000:03:00.0: PM: dpm_run_callback():
>>>> pci_pm_resume returns -110
>>>> [ 184.051945] ath11k_pci 0000:03:00.0: PM: failed to resume async:
>>>> error -110
>>>> [ 187.251911] ath11k_pci 0000:03:00.0: wmi command 16387 timeout
>>>> [ 187.251924] ath11k_pci 0000:03:00.0: failed to send
>>>> WMI_PDEV_SET_PARAM cmd
>>>> [ 187.251933] ath11k_pci 0000:03:00.0: failed to enable dynamic bw:
>>>> -11
>>>>
>>>> So should we generalize above that changing firmware at
>>>> suspend/hibernation time isn't supported. If firmware package is
>>>> updated, does user restarts every time?
>>>
>>> You may want to review how other devices handle this. I can think of
>>> these threads as potential reference
>>>
>>> https://lore.kernel.org/all/
>>> CAPM=9twyvq3EWkwUeoTdMMj76u_sRPmUDHWrzbzEZFQ8eL++BQ@...l.gmail.com/
>>> https://lore.kernel.org/all/20250207012531.621369-1-airlied@gmail.com/
>> They are talking about firmware cache which is not being used in the
>> wireless drivers. In my kernel config, firwmare cache is enabeld. But
>> everytime kernel needs to read the firwamre, it reads from the
>> filesystem.
>>
>> What can be the way forward for this patch? Assuming my previous
>> experiment with changed firmwares across suspend/resume failed, I should
>> remove reuse logic and send again?
>
Not a good approach to justify, but changing of firmware during
suspend/resume doesn't seem to be supported. Maybe let's leave it as it
is and evaluate the current patch in just memory reuse perspective. If
the new memory requested is same as previous, we'll reuse otherwise free
and allocate new memory.
> Perhaps you need to refactor the wireless drivers?
Firmware cache would resolve the changing of firmware. But let's keep is
separate from this current patch. I'm not sure how easy or hard would it
be to enable firmware cache.
If firwamre cache would be present, the current patch would become more
simpler. But without it, the current patch doesn't look that bad.
>
> I'm not convinced your patch is valid. If FW needs to be reloaded due
> to suspend/resume, it seems like the proper thing is to load the same FW
> that was loaded at device boot. Per your testing, loading changed FW
> can cause a failure. Even if it doesn't fail, will the changed firmware
> cause a "breakage" from the user perspective by modifying the device
> behavior?
Ideally, changing of firmware shouldn't cause any issues if driver
handles it correctly. But it seems it not implemented/tested. So its
unsupported.
>
> This does not seem to be a problem that is relevant to all MHI devices,
> so whatever the end solution ends up being, I think that it should not
> be blanket applied to all of MHI.
The drivers already have blankets. This patch was motivated from the
ath12k [1] and ath11k [2]. These driver don't free the memory while
going into suspend and reuse the same memory after resuming. My current
patch has the same essence as these.
[1]
https://lore.kernel.org/all/20240419034034.2842-1-quic_bqiang@quicinc.com/
[2]
https://lore.kernel.org/all/20220506141448.10340-1-quic_akolli@quicinc.com/
--
Regards,
Usama
Powered by blists - more mailing lists