linux-kernel - Excessive page cache occupies DMA32 memory

Message-ID: <766ef20e-7569-46f3-aa3c-b576e4bab4c6@collabora.com>
Date: Mon, 21 Jul 2025 20:03:12 +0500
From: Muhammad Usama Anjum <usama.anjum@...labora.com>
To: "Matthew Wilcox (Oracle)" <willy@...radead.org>
Cc: linux-kernel@...r.kernel.org, gregkh@...uxfoundation.org,
 usama.anjum@...labora.com, Andrew Morton <akpm@...ux-foundation.org>,
 kernel@...labora.com, linux-mm@...ck.org, linux-fsdevel@...r.kernel.org
Subject: Excessive page cache occupies DMA32 memory

Hello,

When 10-12GB out of a total of 16GB of RAM is used as page cache
(active_file + inactive_file) at suspend time, drivers fail to allocate
DMA memory at resume because the DMA memory is either occupied by the page
cache or fragmented. Example:

kworker/u33:5: page allocation failure: order:7, mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
CPU: 1 UID: 0 PID: 7693 Comm: kworker/u33:5 Not tainted 6.11.11-valve17-1-neptune-611-g027868a0ac03 #1 3843143b92e9da0fa2d3d5f21f51beaed15c7d59
Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
Call Trace:
 <TASK>
 dump_stack_lvl+0x4e/0x70
 warn_alloc+0x164/0x190
 ? srso_return_thunk+0x5/0x5f
 ? __alloc_pages_direct_compact+0xaf/0x360
 __alloc_pages_slowpath.constprop.0+0xc75/0xd70
 __alloc_pages_noprof+0x321/0x350
 __dma_direct_alloc_pages.isra.0+0x14a/0x290
 dma_direct_alloc+0x70/0x270
 mhi_fw_load_handler+0x126/0x340 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
 mhi_pm_st_worker+0x5e8/0xac0 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
 ? srso_return_thunk+0x5/0x5f
 process_one_work+0x17e/0x330
 worker_thread+0x2ce/0x3f0
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xd2/0x100
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x34/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>
Mem-Info:
active_anon:513809 inactive_anon:152 isolated_anon:0
active_file:359315 inactive_file:2487001 isolated_file:0
unevictable:637 dirty:19 writeback:0
slab_reclaimable:160391 slab_unreclaimable:39729
mapped:175836 shmem:51039 pagetables:4415
sec_pagetables:0 bounce:0
kernel_misc_reclaimable:0
free:125666 free_pcp:0 free_cma:0
Node 0 active_anon:2055236kB inactive_anon:608kB active_file:1437260kB inactive_file:9948004kB unevictable:2548kB isolated(anon):0kB isolated(file):0kB mapped:703344kB dirty:76kB writeback:0kB shmem:204156kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:495616kB writeback_tmp:0kB kernel_stack:9440kB pagetables:17660kB sec_pagetables:0kB all_unreclaimable? no
Node 0 DMA free:68kB boost:0kB min:68kB low:84kB high:100kB reserved_highatomic:0KB active_anon:8kB inactive_anon:0kB active_file:0kB inactive_file:13232kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 1808 14772 0 0
Node 0 DMA32 free:9796kB boost:0kB min:8264kB low:10328kB high:12392kB reserved_highatomic:0KB active_anon:14148kB inactive_anon:88kB active_file:128kB inactive_file:1757192kB unevictable:0kB writepending:0kB present:1935736kB managed:1867440kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 0 12964 0 0
Node 0 DMA: 5*4kB (U) 0*8kB 1*16kB (U) 1*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 68kB
Node 0 DMA32: 103*4kB (UME) 52*8kB (UME) 43*16kB (UME) 58*32kB (UME) 35*64kB (UME) 23*128kB (UME) 5*256kB (ME) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 9836kB
Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
2897795 total pagecache pages
0 pages in swap cache
Free swap  = 8630724kB
Total swap = 8630776kB
3892604 pages RAM
0 pages HighMem/MovableOnly
101363 pages reserved
0 pages cma reserved
0 pages hwpoisoned

As you can see above, the ~11GB of page cache has consumed the DMA32 pages,
leaving only 9.8MB free, and that is heavily fragmented with no contiguous
blocks ≥512KB (the size of the failed order:7 request). It's hard to
reproduce with a test. We have received several reports for the v6.11
kernel. As we don't have a reliable reproducer yet, we cannot test whether
other kernels are also affected.
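
To make the arithmetic behind that reading explicit, here is a minimal
sketch that parses the "Node 0 DMA32:" free-list line from the log above and
checks whether an order:7 request can be met. It assumes 4kB pages; the
helper names parse_free_list and can_satisfy are made up for illustration
and are not kernel interfaces.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages, as the "103*4kB" entries suggest

def parse_free_list(line):
    """Return {block_size_kB: count} from a 'Node N ZONE: c*SkB ...' line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}

def can_satisfy(free_list, order):
    """True if at least one free block is 2**order pages or larger."""
    need_kb = PAGE_KB << order
    return any(count > 0 and size >= need_kb
               for size, count in free_list.items())

# Free-list line copied from the report above.
dma32 = ("Node 0 DMA32: 103*4kB (UME) 52*8kB (UME) 43*16kB (UME) "
         "58*32kB (UME) 35*64kB (UME) 23*128kB (UME) 5*256kB (ME) "
         "0*512kB 0*1024kB 0*2048kB 0*4096kB = 9836kB")

free = parse_free_list(dma32)
total_kb = sum(size * count for size, count in free.items())
largest_kb = max(size for size, count in free.items() if count > 0)

print(total_kb, largest_kb, can_satisfy(free, 7))  # 9836 256 False
```

The largest free block is 256kB (order 6), so the 512kB request fails even
though about 9.8MB is free in the zone.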

Current mitigations are:
1 Pre-allocate buffers in drivers and don't free them, even if they are only
  used during initialization at boot and resume. But this wastes memory and
  is unacceptable even if it's just 2-4MB.
2 Drop caches at suspend. But this adds latency to suspend and causes
  slowness on resume. There is no way to drop only a couple of GB of page
  cache, which wouldn't take long at suspend time.
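
For reference, mitigation 2 can be sketched from userspace as below. This is
only an illustration, not the code used on the affected devices: it assumes
the standard /proc/sys/vm/drop_caches sysctl, root privileges, and some
suspend hook that calls it. The function name drop_page_cache and its path
parameter are made up here.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"

def drop_page_cache(path=DROP_CACHES):
    """Ask the kernel to drop clean page cache (needs root on a real system).

    `path` is a hypothetical parameter so the sketch can be pointed at a
    scratch file instead of the real sysctl.
    """
    os.sync()  # write back dirty pages first so they become droppable
    with open(path, "w") as f:
        # 1 = page cache, 2 = reclaimable slab (dentries, inodes), 3 = both
        f.write("1\n")
```

The sysctl takes no size argument, so this drops all clean page cache at
once, which is where the suspend latency and the slow resume come from.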

Greg dislikes 1 and rejects it, which is understandable [1]:
> It should be reclaiming this, as it's just cache, not really used
> memory.

Would it be reasonable to add a mechanism to limit page cache growth?
I think there should be some watermark or similar by which we can tell
the page cache not to grow above it. Or, at suspend, drop only a part of
the page cache and not the entire page cache. What other options are
available?

[1] https://lore.kernel.org/all/2025071722-panther-legwarmer-d2be@gregkh 

Thanks,
Muhammad Usama Anjum