Message-ID: <eypd4xigl3yydfj47usazm7ca3kplz5w7bkk7qf6piks4vtaa4@rmecjnlfix66>
Date: Fri, 25 Apr 2025 20:17:14 +0530
From: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
To: Muhammad Usama Anjum <usama.anjum@...labora.com>
Cc: Johannes Berg <johannes@...solutions.net>,
Jeff Johnson <jjohnson@...nel.org>, Jeffrey Hugo <quic_jhugo@...cinc.com>,
Yan Zhen <yanzhen@...o.com>, Youssef Samir <quic_yabdulra@...cinc.com>,
Qiang Yu <quic_qianyu@...cinc.com>, Alex Elder <elder@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Kunwu Chan <chentao@...inos.cn>, kernel@...labora.com,
mhi@...ts.linux.dev, linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-wireless@...r.kernel.org, ath11k@...ts.infradead.org
Subject: Re: [PATCH v2] bus: mhi: host: don't free bhie tables during
suspend/hibernation
On Fri, Apr 25, 2025 at 04:41:43PM +0500, Muhammad Usama Anjum wrote:
> On 4/25/25 1:59 PM, Manivannan Sadhasivam wrote:
> > On Fri, Apr 25, 2025 at 12:42:38PM +0500, Muhammad Usama Anjum wrote:
> >> On 4/25/25 12:32 PM, Manivannan Sadhasivam wrote:
> >>> On Fri, Apr 25, 2025 at 12:14:39PM +0500, Muhammad Usama Anjum wrote:
> >>>> On 4/25/25 12:04 PM, Manivannan Sadhasivam wrote:
> >>>>> On Thu, Apr 10, 2025 at 07:56:54PM +0500, Muhammad Usama Anjum wrote:
> >>>>>> Fix dma_direct_alloc() failure at resume time during bhie_table
> >>>>>> allocation. There is a crash report where at resume time, the memory
> >>>>>> from the dma doesn't get allocated and MHI fails to re-initialize.
> >>>>>> There may be fragmentation of some kind which fails the allocation
> >>>>>> call.
> >>>>>>
> >>>>>
> >>>>> If dma_direct_alloc() fails, then it is a platform limitation/issue. We cannot
> >>>>> work around that in the device drivers. What is the guarantee that other drivers
> >>>>> will also continue to work? Will you go ahead and patch all of them which
> >>>>> release memory during suspend?
> >>>>>
> >>>>> Please investigate why the allocation fails. Even this is not a device issue, so
> >>>>> we cannot add quirks :/
> >>>> This isn't a platform specific quirk. We are only hitting it because
> >>>> there is high memory pressure during suspend/resume. This dma allocation
> >>>> failure can happen with memory pressure on any device.
> >>>>
> >>>
> >>> Yes.
> >> Thanks for understanding.
> >>
> >>>
> >>>> The purpose of this patch is just to make driver more robust to memory
> >>>> pressure during resume.
> >>>>
> >>>> I'm not sure about MHI. But other drivers already have such patches as
> >>>> dma_direct_alloc() is susceptible to failures when memory pressure is
> >>>> high. This patch was motivated from ath12k [1] and ath11k [2].
> >>>>
> >>>
> >>> Even if we patch the MHI driver, the issue is going to trip some other driver.
> >>> How does the DMA memory go low during resume? So some other driver is
> >>> consuming more than it did during probe()?
> >> Think of it like this. The first probe happens just after boot. Most of the
> >> RAM was empty. Then let's say user launches applications which not only
> >> consume entire RAM but also the Swap. The DMA memory area is the first
> >> ~4GB on x86_64 (if I'm not mistaken). Now at resume time when we want to
> >> allocate memory from dma, it may not be available entirely or because of
> >> fragmentation we cannot allocate that much contiguous memory.
> >>
> >
> > Looks like you have a workload that consumes the limited DMA coherent memory.
> > Most likely the GPU applications I think.
> >
> >> In our testing and real world cases, right now only wifi driver is
> >> misbehaving. Wifi is also very important. So we are hoping to make wifi
> >> driver robust.
> >>
> >
> > Sounds fair. If you want to move forward, please modify the existing
> > mhi_power_down_keep_dev() to include this partial unprepare as well:
> >
> > mhi_power_down_unprepare_keep_dev()
> >
> > Since both APIs are anyway going to be used together, I don't see a need to
> > introduce yet another API.
> I've looked at usages of mhi_power_down_keep_dev(). It's getting used by
> ath12k and ath11k both. We would have to look at ath12k as well before
> we can change mhi_power_down_keep_dev(). Unfortunately, I don't have
> a device using ath12k at hand.
>
ath12k conversion looks trivial. So please go ahead with this new API conversion
for that driver as well.
- Mani
--
மணிவண்ணன் சதாசிவம்