linux-kernel - Re: [PATCH v4] bus: mhi: host: don't free bhie tables during suspend/hibernation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <951203c6-44a6-4fa9-afad-6ce3973774ae@collabora.com>
Date: Wed, 14 May 2025 12:17:41 +0500
From: Muhammad Usama Anjum <usama.anjum@...labora.com>
To: Jeff Hugo <jeff.hugo@....qualcomm.com>,
 Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
 Jeff Johnson <jjohnson@...nel.org>, Youssef Samir
 <quic_yabdulra@...cinc.com>, Matthew Leung <quic_mattleun@...cinc.com>,
 Yan Zhen <yanzhen@...o.com>, Greg Kroah-Hartman
 <gregkh@...uxfoundation.org>,
 Jacek Lawrynowicz <jacek.lawrynowicz@...ux.intel.com>,
 Kunwu Chan <chentao@...inos.cn>, Troy Hanson <quic_thanson@...cinc.com>,
 "Dr. David Alan Gilbert" <linux@...blig.org>
Cc: usama.anjum@...labora.com, kernel@...labora.com,
 sebastian.reichel@...labora.com, Carl Vanderlip <quic_carlv@...cinc.com>,
 Alex Elder <elder@...nel.org>, mhi@...ts.linux.dev,
 linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-wireless@...r.kernel.org, ath11k@...ts.infradead.org,
 ath12k@...ts.infradead.org
Subject: Re: [PATCH v4] bus: mhi: host: don't free bhie tables during
 suspend/hibernation

On 5/13/25 8:16 PM, Jeff Hugo wrote:
> On 5/13/2025 12:44 AM, Muhammad Usama Anjum wrote:
>> On 5/12/25 11:46 PM, Jeff Hugo wrote:
>>> On 5/6/2025 8:49 AM, Muhammad Usama Anjum wrote:
>>>> Fix dma_direct_alloc() failure at resume time during bhie_table
>>>> allocation because of memory pressure. There is a report where at
>>>> resume time, the memory from the dma doesn't get allocated and MHI
>>>> fails to re-initialize.
>>>>
>>>> To fix it, don't free the memory at power down during suspend /
>>>> hibernation. Instead, use the same allocated memory again after every
>>>> resume / hibernation. This patch has been tested with resume and
>>>> hibernation both.
>>>>
>>>> The rddm is of constant size for a given hardware. While the fbc_image
>>>> size depends on the firmware. If the firmware changes, we'll free and
>>>> allocate new memory for it.
>>>
>>> Why is it valid to load new firmware as a result of suspend?  I don't
>>> users would expect that.
>> I'm not sure its valid or not. Like other users, I also don't expect
>> that firmware would get changed. It doesn't seem to be tested and hence
>> supported case.
>>
>> But other drivers have code which have implementation like this. I'd
>> mentioned previously that this patch was motivated from the ath12k [1]
>> and ath11k [2] patches. They don't free the memory and reuse the same
>> memory if new size is same.
> 
> It feels like this justification needs to be detailed in the commit
> text. I suspect at some point we'll have another MHI device where the FW
> will need to be cached.
Okay. I'll add this information to the commit message. Currently I've
not seen firmware caching being used other than graphics driver.

> 
>>>> diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c
>>>> index efa3b6dddf4d2..bc8459798bbee 100644
>>>> --- a/drivers/bus/mhi/host/boot.c
>>>> +++ b/drivers/bus/mhi/host/boot.c
>>>> @@ -584,10 +584,17 @@ void mhi_fw_load_handler(struct mhi_controller
>>>> *mhi_cntrl)
>>>>         * device transitioning into MHI READY state
>>>>         */
>>>>        if (fw_load_type == MHI_FW_LOAD_FBC) {
>>>
>>> Why is this FBC specific?
>> It seems we allocate fbc_image only when firmware load type is
>> FW_LOAD_FBC. I'm just optimizing the buffer allocation here.
> 
> We alloc bhie tables in non FBC usecases. Is this somehow an FBC
> specific issue? Perhaps you could clarify the limits of this solution in
> the commit text?
Okay. I'll add information that we are optimizing the bhie allocations.
It has nothing to do with firmware type. I've found only 2 bhie
allocations; fbc_image and rddm_image. So we are optimizing those.

> 
>>
>>>
>>>> -        ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image,
>>>> fw_sz);
>>>> -        if (ret) {
>>>> -            release_firmware(firmware);
>>>> -            goto error_fw_load;
>>>> +        if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) {
>>>> +            mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image);
>>>> +            mhi_cntrl->fbc_image = NULL;
>>>> +        }
>>>> +        if (!mhi_cntrl->fbc_image) {
>>>> +            ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl-
>>>>> fbc_image, fw_sz);
>>>> +            if (ret) {
>>>> +                release_firmware(firmware);
>>>> +                goto error_fw_load;
>>>> +            }
>>>> +            mhi_cntrl->prev_fw_sz = fw_sz;
>>>>            }
>>>>              /* Load the firmware into BHIE vec table */
>>>> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c
>>>> index e6c3ff62bab1d..107d71b4cc51a 100644
>>>> --- a/drivers/bus/mhi/host/pm.c
>>>> +++ b/drivers/bus/mhi/host/pm.c
>>>> @@ -1259,10 +1259,19 @@ void mhi_power_down(struct mhi_controller
>>>> *mhi_cntrl, bool graceful)
>>>>    }
>>>>    EXPORT_SYMBOL_GPL(mhi_power_down);
>>>>    +static void __mhi_power_down_unprepare_keep_dev(struct
>>>> mhi_controller *mhi_cntrl)
>>>> +{
>>>> +    mhi_cntrl->bhi = NULL;
>>>> +    mhi_cntrl->bhie = NULL;
>>>
>>> Why?
>> This function is shorter version of mhi_unprepare_after_power_down(). As
>> we need different code path in case of suspend/hibernation case, I was
>> adding a new API which Mani asked me remove and consolidate into
>> mhi_power_down_keep_dev() instead. So this static function has been
>> added. [3]
> 
> I don't understand the need to zero these out.  Also, if you are copying
> part of the functionality of mhi_unprepare_after_power_down(), shouldn't
> that functionality be moved into your new API to eliminate duplication?
This how the cleanup works mhi_unprepare_after_power_down(). Yeah, it
makes sense to use this function in mhi_unprepare_after_power_down().

Sending next version soon.
> 


-- 
Regards,
Usama