Message-ID: <12a66199-3fe2-49d1-994a-84e4e1d2eba9@oss.qualcomm.com>
Date: Wed, 23 Jul 2025 14:50:15 +0800
From: Baochen Qiang <baochen.qiang@....qualcomm.com>
To: Robin Murphy <robin.murphy@....com>, Greg KH
<gregkh@...uxfoundation.org>,
Muhammad Usama Anjum <usama.anjum@...labora.com>
Cc: Matthew Wilcox <willy@...radead.org>,
Jeff Hugo <jeff.hugo@....qualcomm.com>,
Manivannan Sadhasivam <mani@...nel.org>,
Jeff Johnson <jjohnson@...nel.org>,
Marek Szyprowski <m.szyprowski@...sung.com>,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
kernel@...labora.com, Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev
Subject: Re: Excessive page cache occupies DMA32 memory
On 7/22/2025 6:03 PM, Robin Murphy wrote:
> On 2025-07-22 8:24 am, Greg KH wrote:
>> On Tue, Jul 22, 2025 at 11:05:11AM +0500, Muhammad Usama Anjum wrote:
>>> Adding ath/mhi and dma API developers to the discussion.
>>>
>>> On 7/22/25 10:32 AM, Greg KH wrote:
>>>> On Mon, Jul 21, 2025 at 06:13:10PM +0100, Matthew Wilcox wrote:
>>>>> On Mon, Jul 21, 2025 at 08:03:12PM +0500, Muhammad Usama Anjum wrote:
>>>>>> Hello,
>>>>>>
>>>>>> When 10-12GB out of total 16GB RAM is being used as page cache
>>>>>> (active_file + inactive_file) at suspend time, the drivers fail to allocate
>>>>>> dma memory at resume as dma memory is either occupied by the page cache or
>>>>>> fragmented. Example:
>>>>>>
>>>>>> kworker/u33:5: page allocation failure: order:7, mode:0xc04(GFP_NOIO|GFP_DMA32),
>>>>>> nodemask=(null),cpuset=/,mems_allowed=0
>>>>>
>>>>> Just to be clear, this is not a page cache problem. The driver is asking
>>>>> us to do a 512kB allocation without doing I/O! This is a ridiculous
>>>>> request that should be expected to fail.
>>>>>
>>>>> The solution, whatever it may be, is not related to the page cache.
>>>>> I reject your diagnosis. Almost all of the page cache is clean and
>>>>> could be dropped (as far as I can tell from the output below).
>>>>>
>>>>> Now, I'm not too familiar with how the page allocator chooses to fail
>>>>> this request. Maybe it should be trying harder to drop bits of the page
>>>>> cache. Maybe it should be doing some compaction.
>>> That's very thoughtful. I'll look into why the page allocator isn't
>>> dropping cache or doing compaction.
>>>
>>>>> I am not inclined to
>>>>> go digging on your behalf, because frankly I'm offended by the suggestion
>>>>> that the page cache is at fault.
>>> I apologize—that wasn't my intention.
>>>
>>>>>
>>>>> Perhaps somebody else will help you, or you can dig into this yourself.
>>>>
>>>> I'm with Matthew, this really looks like a driver bug somehow. If there
>>>> is page cache memory that is "clean", the driver should be able to
>>>> access it just fine if really required.
>>>>
>>>> What exact driver(s) is having this problem? What is the exact error,
>>>> and on what lines of code?
>>> The issue occurs in both the ath11k and mhi drivers during resume, when
>>> dma_alloc_coherent(GFP_KERNEL) fails (returns NULL) and the caller returns
>>> -ENOMEM. This failure has been observed at multiple points in these drivers.
>>>
>>> For example, in the mhi driver, the failure is triggered when the
>>> MHI's st_worker gets scheduled-in at resume.
>>>
>>> mhi_pm_st_worker()
>>> -> mhi_fw_load_handler()
>>> -> mhi_load_image_bhi()
>>> -> mhi_alloc_bhi_buffer()
>>> -> dma_alloc_coherent(GFP_KERNEL) fails, caller returns -ENOMEM
>>
>> And what is the exact size you are asking for here?
>> What is the dma ops set to for your system? Are you sure that is
>> working properly for your platform? What platform is this exactly?
>>
>> The driver isn't asking for DMA32 here, so that shouldn't be the issue,
>> so why do you feel it is? Have you tried using the tracing stuff for
>> dma allocations to see exactly what is going on for this failure?
>
> I'm guessing the device has a 32-bit DMA mask, and the allocation ends up in
Yeah, the device is capable of 32-bit coherent DMA only.
> __dma_direct_alloc_pages() such that that adds GFP_DMA32 in order to try to satisfy the
> mask via regular page allocation. How GFP_KERNEL turns into GFP_NOIO, though, given that
> the DMA layer certainly isn't (knowingly) messing with __GFP_IO or __GFP_FS, is more of a
> mystery... I suppose "during resume" is the red flag there - is this worker perhaps trying
> to run too early in some restricted context before the rest of the system has fully woken up?
The worker is running at __resume_early stage.
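That timing would fit the mode in the log. As far as I know, the PM core
clears __GFP_IO and __GFP_FS from gfp_allowed_mask around suspend
(pm_restrict_gfp_mask()) and only restores them later in resume, so a
GFP_KERNEL request made in between is effectively treated as GFP_NOIO.
Below is a minimal userspace sketch of that arithmetic, not kernel code.
The sk_* names are hypothetical, and the flag bit values are assumptions
chosen to match the mode:0xc04 in the report; they should be checked
against include/linux/gfp_types.h for the kernel in question. The 4 KiB
page size is also an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed flag bits (hypothetical SK_ names), picked to match mode:0xc04. */
#define SK_GFP_DMA32          0x004u
#define SK_GFP_IO             0x040u
#define SK_GFP_FS             0x080u
#define SK_GFP_DIRECT_RECLAIM 0x400u
#define SK_GFP_KSWAPD_RECLAIM 0x800u
#define SK_GFP_KERNEL (SK_GFP_DIRECT_RECLAIM | SK_GFP_KSWAPD_RECLAIM | \
                       SK_GFP_IO | SK_GFP_FS)

/* Models the global mask that the allocator ANDs into every request. */
unsigned int sk_gfp_allowed_mask = ~0u;

/* Models the restriction applied while the system is suspending/resuming. */
void sk_pm_restrict_gfp_mask(void)
{
    sk_gfp_allowed_mask &= ~(SK_GFP_IO | SK_GFP_FS);
}

/* Models lifting the restriction once resume has completed. */
void sk_pm_restore_gfp_mask(void)
{
    sk_gfp_allowed_mask = ~0u;
}

/* Flags the allocator would actually act on for a given request. */
unsigned int sk_effective_gfp(unsigned int gfp)
{
    return gfp & sk_gfp_allowed_mask;
}

/* Size in bytes of an order-N allocation for a given page size. */
unsigned long sk_order_to_bytes(unsigned int order, unsigned long page_size)
{
    return page_size << order;
}

/* Prints requested vs. effective flags and the allocation size. */
void sk_report(unsigned int gfp, unsigned int order)
{
    printf("requested 0x%x, effective 0x%x, size %lu bytes\n",
           gfp, sk_effective_gfp(gfp), sk_order_to_bytes(order, 4096UL));
}
```

Calling sk_pm_restrict_gfp_mask() and then
sk_report(SK_GFP_KERNEL | SK_GFP_DMA32, 7) models the failing request: a
GFP_KERNEL allocation with GFP_DMA32 added for the 32-bit mask, at order 7,
issued while I/O and FS reclaim are still disallowed.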
>
> Thanks,
> Robin.
>
>>
>> I think you need to do a bit more debugging :)
>>
>> thanks,
>>
>> greg k-h
>