Message-Id: <20241210094617.66152-1-husong@kylinos.cn>
Date: Tue, 10 Dec 2024 17:46:17 +0800
From: Hu Song <husong@...inos.cn>
To: gregkh@...uxfoundation.org,
viro@...iv.linux.org.uk
Cc: linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org,
husong@...inos.cn,
liuye@...inos.cn
Subject: [PATCH 4.19.y] fs/iomap: use consistent gfp flags during xfs readpages
In low-memory situations (specifically in Docker containers), the
xfs_vm_readpages path can declare a memcg OOM during a filesystem page
fault and kill applications.

This patch extends commit 8a5c743e308d ("mm, memcg: use consistent
gfp flags during readahead") to cover XFS by making its readahead path
use readahead_gfp_mask(). Specifically, the gfp_mask logic in
xfs_vm_readpages and related functions (here, iomap_next_page) is now
aligned with readahead_gfp_mask(), so readahead behaves consistently.
readahead_gfp_mask() adds __GFP_NORETRY | __GFP_NOWARN to the mapping's
gfp mask, so a memcg charge that cannot be satisfied fails the readahead
allocation instead of invoking the OOM killer.

Test results:

Start a container limited to 100 MiB of memory and no additional swap:
  docker container run --name wget.100m.ky -d
      --memory 104857600 --memory-swap 104857600;
Inside the container, download a large (2G) file:
  wget http://172.17.0.1/testfile

Before the fix:

A debug printk of the parameters and return value of
try_to_free_mem_cgroup_pages() shows:
  gfp_mask=0x62004a
  (GFP_NOFS|__GFP_HIGHMEM|__GFP_HARDWALL|__GFP_MOVABLE)
  nr_reclaimed: 0

[ 153.390196] CPU: 1 PID: 5405 Comm: wget Kdump: loaded Not tainted 4.19.90-25+ #24
[ 153.390197] Hardware name: American Megatrends Inc. To be filled by O.E.M./To be filled by O.E.M., BIOS ITSW3001 09/14/2020
[ 153.390197] Call Trace:
[ 153.390199] dump_stack+0x64/0x88
[ 153.390200] try_to_free_mem_cgroup_pages.cold+0x30/0x3e
[ 153.390201] try_charge+0x2d9/0x7a0
[ 153.390202] ? memcg_check_events+0xdd/0x250
[ 153.390203] mem_cgroup_try_charge+0x8b/0x180
[ 153.390204] __add_to_page_cache_locked+0x64/0x240
[ 153.390205] add_to_page_cache_lru+0x48/0xe0
[ 153.390206] iomap_readpages_actor+0x10e/0x240
[ 153.390207] iomap_apply+0xc3/0x130
[ 153.390208] ? iomap_write_begin.constprop.0+0x310/0x310
[ 153.390209] iomap_readpages+0xa4/0x190
[ 153.390210] ? iomap_write_begin.constprop.0+0x310/0x310
[ 153.390211] read_pages.isra.0+0x72/0x190
[ 153.390212] __do_page_cache_readahead+0x1b2/0x1d0
[ 153.390214] filemap_fault+0x2d6/0x570
[ 153.390235] __xfs_filemap_fault+0x6b/0x200 [xfs]
[ 153.390236] __do_fault+0x38/0x120
[ 153.390237] do_fault+0x119/0x3e0
[ 153.390238] __handle_mm_fault+0x455/0x5d0
[ 153.390239] handle_mm_fault+0x90/0x1b0
[ 153.390240] __do_page_fault+0x2ea/0x540
[ 153.390242] do_page_fault+0x33/0x120
[ 153.390243] ? page_fault+0x8/0x30
[ 153.390243] page_fault+0x1e/0x30
[ 153.390244] RIP: 0033:0x7f5404794514
[ 153.390246] Code: Bad RIP value.
[ 153.390246] RSP: 002b:00007fff244f0728 EFLAGS: 00010246
[ 153.390246] RAX: 0000000000001000 RBX: 0000000000001000 RCX: 00007f5404794514
[ 153.390247] RDX: 0000000000001000 RSI: 000055ef7f87e640 RDI: 0000000000000004
[ 153.390247] RBP: 000055ef7f87e640 R08: 0000000000000000 R09: 000055ef7f87e670
[ 153.390248] R10: 000055ef7f87e620 R11: 0000000000000246 R12: 000055ef7f879d80
[ 153.390248] R13: 0000000000001000 R14: 00007f540485d7c0 R15: 0000000000001000
[ 153.390257] wget invoked oom-killer: gfp_mask=0x600040(GFP_NOFS), nodemask=(null), order=0, oom_score_adj=0
[ 153.390257] wget cpuset=bae816dd30bd6e193684d5580f57fd54df29c0a695dec5b7606931d248c18dd2 mems_allowed=0

When downloading the 2G file, wget is OOM-killed almost every time.

After the fix:

A debug printk of the parameters and return value of
try_to_free_mem_cgroup_pages() shows:
  gfp_mask=0x62124a
  (GFP_NOFS|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_NORETRY|
  __GFP_HARDWALL|__GFP_MOVABLE)
  nr_reclaimed: 55

[ 196.970857] CPU: 9 PID: 5326 Comm: wget Kdump: loaded Not tainted 4.19.90-25+ #23
[ 196.970858] Hardware name: American Megatrends Inc. To be filled by O.E.M./To be filled by O.E.M., BIOS ITSW3001 09/14/2020
[ 196.970858] Call Trace:
[ 196.970860] dump_stack+0x64/0x88
[ 196.970861] try_to_free_mem_cgroup_pages.cold+0x30/0x3e
[ 196.970862] try_charge+0x2d9/0x7a0
[ 196.970863] ? memcg_check_events+0xdd/0x250
[ 196.970865] mem_cgroup_try_charge+0x8b/0x180
[ 196.970865] __add_to_page_cache_locked+0x64/0x240
[ 196.970866] add_to_page_cache_lru+0x48/0xe0
[ 196.970868] iomap_readpages_actor+0x125/0x250
[ 196.970869] iomap_apply+0xc3/0x130
[ 196.970870] ? iomap_write_begin.constprop.0+0x310/0x310
[ 196.970871] iomap_readpages+0xa4/0x190
[ 196.970872] ? iomap_write_begin.constprop.0+0x310/0x310
[ 196.970873] read_pages.isra.0+0x72/0x190
[ 196.970875] __do_page_cache_readahead+0x160/0x1d0
[ 196.970876] filemap_fault+0x2d6/0x570
[ 196.970897] __xfs_filemap_fault+0x6b/0x200 [xfs]
[ 196.970899] __do_fault+0x38/0x120
[ 196.970900] do_fault+0x119/0x3e0
[ 196.970901] __handle_mm_fault+0x455/0x5d0
[ 196.970903] handle_mm_fault+0x90/0x1b0
[ 196.970905] __do_page_fault+0x2ea/0x540
[ 196.970906] do_page_fault+0x33/0x120
[ 196.970907] ? page_fault+0x8/0x30
[ 196.970908] page_fault+0x1e/0x30
[ 196.970909] RIP: 0033:0x7fed5d34b340
[ 196.970911] Code: Bad RIP value.
[ 196.970912] RSP: 002b:00007ffcf231fd68 EFLAGS: 00010246
[ 196.970913] RAX: 0000000000000000 RBX: 000055f860649030 RCX: 00000000061a9000
[ 196.970913] RDX: 000055f860664980 RSI: 0000000000000000 RDI: 000055f860649030
[ 196.970913] RBP: 000000000000003b R08: 7fffffffffffffff R09: 7ffffffff9e58fff
[ 196.970914] R10: 000055f860667620 R11: 0000000000000246 R12: 00000000061a9000
[ 196.970914] R13: 0000000000000000 R14: 000055f860664b50 R15: 000055f860664980

The 2G wget download was repeated 500 times without wget being killed.

Fixes: 8a5c743e308d ("mm, memcg: use consistent gfp flags during readahead")
Signed-off-by: Hu Song <husong@...inos.cn>
---
fs/iomap.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/iomap.c b/fs/iomap.c
index 04e82b6bd9bf..a34e4ec874f0 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -424,6 +424,7 @@ static struct page *
 iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos,
 		loff_t length, loff_t *done)
 {
+	gfp_t gfp_mask = readahead_gfp_mask(inode->i_mapping);
 	while (!list_empty(pages)) {
 		struct page *page = lru_to_page(pages);
 
@@ -432,7 +433,7 @@ iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos,
 
 		list_del(&page->lru);
 		if (!add_to_page_cache_lru(page, inode->i_mapping, page->index,
-				GFP_NOFS))
+				gfp_mask | GFP_NOFS))
 			return page;
 
 		/*
--
2.25.1