Message-ID: <37ea52d5-11de-49c7-a676-ec47cca7f91b@oss.qualcomm.com>
Date: Fri, 11 Apr 2025 10:10:06 -0600
From: Jeff Hugo <jeff.hugo@....qualcomm.com>
To: Muhammad Usama Anjum <usama.anjum@...labora.com>,
Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
Johannes Berg <johannes@...solutions.net>,
Jeff Johnson
<jjohnson@...nel.org>, Yan Zhen <yanzhen@...o.com>,
Youssef Samir <quic_yabdulra@...cinc.com>,
Qiang Yu
<quic_qianyu@...cinc.com>, Alex Elder <elder@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Kunwu Chan <chentao@...inos.cn>
Cc: kernel@...labora.com, mhi@...ts.linux.dev, linux-arm-msm@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-wireless@...r.kernel.org,
ath11k@...ts.infradead.org
Subject: Re: [PATCH v2] bus: mhi: host: don't free bhie tables during
suspend/hibernation
On 4/10/2025 8:56 AM, Muhammad Usama Anjum wrote:
> Fix dma_direct_alloc() failure at resume time during bhie_table
> allocation. There is a crash report where at resume time, the memory
> from the dma doesn't get allocated and MHI fails to re-initialize.
> There may be fragmentation of some kind which fails the allocation
> call.
>
> To fix it, don't free the memory at power down during suspend /
> hibernation. Instead, use the same allocated memory again after every
> resume / hibernation. This patch has been tested with resume and
> hibernation both.
>
> The rddm is of constant size for a given hardware. While the fbc_image
> size depends on the firmware. If the firmware changes, we'll free and
> allocate new memory for it.
>
> Here are the crash logs:
>
> [ 3029.338587] mhi mhi0: Requested to power ON
> [ 3029.338621] mhi mhi0: Power on setup success
> [ 3029.668654] kworker/u33:8: page allocation failure: order:7, mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
> [ 3029.668682] CPU: 4 UID: 0 PID: 2744 Comm: kworker/u33:8 Not tainted 6.11.11-valve10-1-neptune-611-gb69e902b4338 #1ed779c892334112fb968aaa3facf9686b5ff0bd7
> [ 3029.668690] Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
> [ 3029.668694] Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
> [ 3029.668717] Call Trace:
> [ 3029.668722] <TASK>
> [ 3029.668728] dump_stack_lvl+0x4e/0x70
> [ 3029.668738] warn_alloc+0x164/0x190
> [ 3029.668747] ? srso_return_thunk+0x5/0x5f
> [ 3029.668754] ? __alloc_pages_direct_compact+0xaf/0x360
> [ 3029.668761] __alloc_pages_slowpath.constprop.0+0xc75/0xd70
> [ 3029.668774] __alloc_pages_noprof+0x321/0x350
> [ 3029.668782] __dma_direct_alloc_pages.isra.0+0x14a/0x290
> [ 3029.668790] dma_direct_alloc+0x70/0x270
> [ 3029.668796] mhi_alloc_bhie_table+0xe8/0x190 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> [ 3029.668814] mhi_fw_load_handler+0x1bc/0x310 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> [ 3029.668830] mhi_pm_st_worker+0x5c8/0xaa0 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> [ 3029.668844] ? srso_return_thunk+0x5/0x5f
> [ 3029.668853] process_one_work+0x17e/0x330
> [ 3029.668861] worker_thread+0x2ce/0x3f0
> [ 3029.668868] ? __pfx_worker_thread+0x10/0x10
> [ 3029.668873] kthread+0xd2/0x100
> [ 3029.668879] ? __pfx_kthread+0x10/0x10
> [ 3029.668885] ret_from_fork+0x34/0x50
> [ 3029.668892] ? __pfx_kthread+0x10/0x10
> [ 3029.668898] ret_from_fork_asm+0x1a/0x30
> [ 3029.668910] </TASK>
>
> Tested-on: QCNFA765 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6
>
> Signed-off-by: Muhammad Usama Anjum <usama.anjum@...labora.com>
> ---
> Changes since v1:
> - Don't free bhie tables during suspend/hibernation only
> - Handle fbc_image changed size correctly
> - Remove fbc_image getting set to NULL in *free_bhie_table()
> ---
> drivers/bus/mhi/host/boot.c | 15 +++++++++++----
> drivers/bus/mhi/host/init.c | 13 ++++++++++---
> drivers/net/wireless/ath/ath11k/mhi.c | 9 +++++----
> include/linux/mhi.h | 7 +++++++
> 4 files changed, 33 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c
> index 9dcc7184817d5..0df26100c8f9c 100644
> --- a/drivers/bus/mhi/host/boot.c
> +++ b/drivers/bus/mhi/host/boot.c
> @@ -487,10 +487,17 @@ void mhi_fw_load_handler(struct mhi_controller *mhi_cntrl)
> * device transitioning into MHI READY state
> */
> if (mhi_cntrl->fbc_download) {
> - ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, fw_sz);
> - if (ret) {
> - release_firmware(firmware);
> - goto error_fw_load;
> + if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) {
> + mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image);
> + mhi_cntrl->fbc_image = NULL;
> + }
> + if (!mhi_cntrl->fbc_image) {
> + ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, fw_sz);
> + if (ret) {
> + release_firmware(firmware);
> + goto error_fw_load;
> + }
> + mhi_cntrl->prev_fw_sz = fw_sz;
This seems confusing. Why track the previous fw size when what actually
matters is the allocated bhie table size? Also, if the fw size is
smaller than the allocated table size, it looks like we'll do a
free/alloc when we could just reuse the memory we already have.
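
To illustrate the reuse-if-large-enough idea, here is a rough userspace
sketch. It is not MHI code: struct table, table_ensure() and the capacity
field are hypothetical names, malloc()/free() stand in for the DMA
allocation, and the real bhie table is a vector of segments rather than a
single buffer. It assumes needed is greater than zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a BHIE table; not the real struct image_info. */
struct table {
	void *mem;
	size_t capacity;	/* bytes currently allocated, 0 if none */
};

/*
 * Make sure the table can hold `needed` bytes (assumed > 0).
 * Existing memory is reused when it is already large enough; only a
 * larger request triggers a free followed by a fresh allocation.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int table_ensure(struct table *t, size_t needed)
{
	assert(t != NULL);

	/* Compare against what is allocated, not against the last request. */
	if (t->mem && needed <= t->capacity)
		return 0;

	/* Too small (or nothing allocated yet): drop the old buffer first. */
	free(t->mem);
	t->mem = NULL;
	t->capacity = 0;

	t->mem = malloc(needed);
	if (!t->mem)
		return -1;

	t->capacity = needed;
	return 0;
}
```

With this shape, a smaller firmware image after resume keeps the existing
allocation, and only a larger image pays for a new one.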
> }
>
> /* Load the firmware into BHIE vec table */
> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
> index 059dc94d20bb6..65a47c712b3a0 100644
> --- a/include/linux/mhi.h
> +++ b/include/linux/mhi.h
> @@ -382,6 +382,7 @@ struct mhi_controller {
> const char *fw_image;
> const u8 *fw_data;
> size_t fw_sz;
> + size_t prev_fw_sz;
No documentation?
> const char *edl_image;
> size_t rddm_size;
> size_t sbl_size;
> @@ -662,6 +663,12 @@ void mhi_power_down_keep_dev(struct mhi_controller *mhi_cntrl, bool graceful);
> */
> void mhi_unprepare_after_power_down(struct mhi_controller *mhi_cntrl);
>
> +/**
> + * mhi_partial_unprepare_after_power_down - Free any allocated memory after power down partially
This looks like it exceeds 80 characters.
Also, what does "power down partially" mean?