[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6bcb4883-03d0-88eb-4c42-84fff0a9a141@loongson.cn>
Date: Thu, 15 Sep 2022 09:19:48 +0800
From: Hongchen Zhang <zhanghongchen@...ngson.cn>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH] mm/vmscan: don't scan adjust too much if current is not
kswapd
Hi Andrew,
On 2022/9/15 am 6:51, Andrew Morton wrote:
> On Wed, 14 Sep 2022 10:33:18 +0800 Hongchen Zhang <zhanghongchen@...ngson.cn> wrote:
>
>> when a process falls into page fault and there is not enough free
>> memory,it will do direct reclaim. At the same time,it is holding
>> mmap_lock.So in case of multi-thread,it should exit from page fault
>> ASAP.
>> When reclaim memory,we do scan adjust between anon and file lru which
>> may cost too much time and trigger hung task for other thread.So for a
>> process which is not kswapd,it should just do a little scan adjust.
>
> Well, that's a pretty nasty bug. Before diving into a possible fix,
> can you please tell us more about how this happens? What sort of
> machine, what sort of workload. Can you suggest why others are not
> experiencing this?
>We got a hung task panic originally by doing ltpstress on a Loongarch
3A5000+71000 machine.Then, we found the same problem on a X86 machine as
following:
[ 3748.453561] INFO: task float_bessel:77920 blocked for more than 120
seconds.
[ 3748.460839] Not tainted 5.15.0-46-generic #49-Ubuntu
[ 3748.466490] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3748.474618] task:float_bessel state:D stack: 0 pid:77920 ppid:
77327 flags:0x00004002
[ 3748.483358] Call Trace:
[ 3748.485964] <TASK>
[ 3748.488150] __schedule+0x23d/0x590
[ 3748.491804] schedule+0x4e/0xc0
[ 3748.495038] rwsem_down_read_slowpath+0x336/0x390
[ 3748.499886] ? copy_user_enhanced_fast_string+0xe/0x40
[ 3748.505181] down_read+0x43/0xa0
[ 3748.508518] do_user_addr_fault+0x41c/0x670
[ 3748.512799] exc_page_fault+0x77/0x170
[ 3748.516673] asm_exc_page_fault+0x26/0x30
[ 3748.520824] RIP: 0010:copy_user_enhanced_fast_string+0xe/0x40
[ 3748.526764] Code: 89 d1 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 31 c0
0f 01 ca c3 cc cc cc cc 0f 1f 00 0f 01 cb 83 fa 40 0f 82 70 ff ff ff 89
d1 <f3> a4 31 c0 0f 01 ca c3 cc cc cc cc 66 08
[ 3748.546120] RSP: 0018:ffffaa9248fffb90 EFLAGS: 00050206
[ 3748.551495] RAX: 00007f99faa1a010 RBX: ffffaa9248fffd88 RCX:
0000000000000010
[ 3748.558828] RDX: 0000000000001000 RSI: ffff9db397ab8ff0 RDI:
00007f99faa1a000
[ 3748.566160] RBP: ffffaa9248fffbf0 R08: ffffcc2fc2965d80 R09:
0000000000000014
[ 3748.573492] R10: 0000000000000000 R11: 0000000000000014 R12:
0000000000001000
[ 3748.580858] R13: 0000000000001000 R14: 0000000000000000 R15:
ffffaa9248fffd98
[ 3748.588196] ? copy_page_to_iter+0x10e/0x400
[ 3748.592614] filemap_read+0x174/0x3e0
[ 3748.596354] ? ima_file_check+0x6a/0xa0
[ 3748.600301] generic_file_read_iter+0xe5/0x150
[ 3748.604884] ext4_file_read_iter+0x5b/0x190
[ 3748.609164] ? aa_file_perm+0x102/0x250
[ 3748.613125] new_sync_read+0x10d/0x1a0
[ 3748.617009] vfs_read+0x103/0x1a0
[ 3748.620423] ksys_read+0x67/0xf0
[ 3748.623743] __x64_sys_read+0x19/0x20
[ 3748.627511] do_syscall_64+0x59/0xc0
[ 3748.631203] ? syscall_exit_to_user_mode+0x27/0x50
[ 3748.636144] ? do_syscall_64+0x69/0xc0
[ 3748.639992] ? exit_to_user_mode_prepare+0x96/0xb0
[ 3748.644931] ? irqentry_exit_to_user_mode+0x9/0x20
[ 3748.649872] ? irqentry_exit+0x1d/0x30
[ 3748.653737] ? exc_page_fault+0x89/0x170
[ 3748.657795] entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 3748.663030] RIP: 0033:0x7f9a852989cc
[ 3748.666713] RSP: 002b:00007f9a8497dc90 EFLAGS: 00000246 ORIG_RAX:
0000000000000000
[ 3748.674487] RAX: ffffffffffffffda RBX: 00007f9a8497f5c0 RCX:
00007f9a852989cc
[ 3748.681817] RDX: 0000000000027100 RSI: 00007f99faa18010 RDI:
0000000000000061
[ 3748.689150] RBP: 00007f9a8497dd60 R08: 0000000000000000 R09:
00007f99faa18010
[ 3748.696493] R10: 0000000000000000 R11: 0000000000000246 R12:
00007f99faa18010
[ 3748.703841] R13: 00005605e11c406f R14: 0000000000000001 R15:
0000000000027100
[ 3748.711199] </TASK>
...
...
[ 3750.943278] Kernel panic - not syncing: hung_task: blocked tasks
[ 3750.949399] CPU: 1 PID: 39 Comm: khungtaskd Not tainted
5.15.0-46-generic #49-Ubuntu
[ 3750.957305] Hardware name: LENOVO 90DWCTO1WW/30D9, BIOS M05KT70A
03/09/2017
[ 3750.964410] Call Trace:
[ 3750.966897] <TASK>
[ 3750.969031] show_stack+0x52/0x5c
[ 3750.972409] dump_stack_lvl+0x4a/0x63
[ 3750.976129] dump_stack+0x10/0x16
[ 3750.979491] panic+0x149/0x321
[ 3750.982612] check_hung_uninterruptible_tasks.cold+0x34/0x48
[ 3750.988383] watchdog+0xad/0xb0
[ 3750.991562] ? check_hung_uninterruptible_tasks+0x300/0x300
[ 3750.997285] kthread+0x127/0x150
[ 3751.000587] ? set_kthread_struct+0x50/0x50
[ 3751.004878] ret_from_fork+0x1f/0x30
[ 3751.008527] </TASK>
[ 3751.010794] Kernel Offset: 0x34600000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 3751.034481] ---[ end Kernel panic - not syncing: hung_task: blocked
tasks ]---
The difference with normal ltpstress test is we use a very large swap
partition,so the swap pressure is bigger than normal,and this problem
becomes more likely to occur.
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -3042,11 +3042,17 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
>> nr[lru] = targets[lru] * (100 - percentage) / 100;
>> nr[lru] -= min(nr[lru], nr_scanned);
>>
>> + if (!current_is_kswapd())
>> + nr[lru] = min(nr[lru], nr_to_reclaim);
>> +
>> lru += LRU_ACTIVE;
>> nr_scanned = targets[lru] - nr[lru];
>> nr[lru] = targets[lru] * (100 - percentage) / 100;
>> nr[lru] -= min(nr[lru], nr_scanned);
>>
>> + if (!current_is_kswapd())
>> + nr[lru] = min(nr[lru], nr_to_reclaim);
>> +
>> scan_adjusted = true;
>> }
>> blk_finish_plug(&plug);
>
> It would be better if these additions had code comments explaining why
> they're there. But let's more fully understand the problem before
> altering your patch.
Thanks,
Hongchen Zhang
Powered by blists - more mailing lists