[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1561411839.5154.60.camel@lca.pw>
Date: Mon, 24 Jun 2019 17:30:39 -0400
From: Qian Cai <cai@....pw>
To: Will Deacon <will@...nel.org>
Cc: Anshuman Khandual <anshuman.khandual@....com>,
Catalin Marinas <catalin.marinas@....com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
Mike Kravetz <mike.kravetz@...cle.com>
Subject: Re: LTP hugemmap05 test case failure on arm64 with linux-next
(next-20190613)
So the problem is that ipcget_public() has held the semaphore "ids->rwsem" for
too long seems unnecessarily and then goes to sleep sometimes due to direct
reclaim (other times LTP hugemmap05 [1] has hugetlb_file_setup() returns
-ENOMEM),
[ 788.765739][ T1315] INFO: task hugemmap05:5001 can't die for more than 122
seconds.
[ 788.773512][ T1315] hugemmap05 R running task 25600 5001 1
0x0000000d
[ 788.781348][ T1315] Call trace:
[ 788.784536][ T1315] __switch_to+0x2e0/0x37c
[ 788.788848][ T1315] try_to_free_pages+0x614/0x934
[ 788.793679][ T1315] __alloc_pages_nodemask+0xe88/0x1d60
[ 788.799030][ T1315] alloc_fresh_huge_page+0x16c/0x588
[ 788.804206][ T1315] alloc_surplus_huge_page+0x9c/0x278
[ 788.809468][ T1315] hugetlb_acct_memory+0x114/0x5c4
[ 788.814469][ T1315] hugetlb_reserve_pages+0x170/0x2b0
[ 788.819662][ T1315] hugetlb_file_setup+0x26c/0x3a8
[ 788.824600][ T1315] newseg+0x220/0x63c
[ 788.828490][ T1315] ipcget+0x570/0x674
[ 788.832377][ T1315] ksys_shmget+0x90/0xc4
[ 788.836525][ T1315] __arm64_sys_shmget+0x54/0x88
[ 788.841282][ T1315] el0_svc_handler+0x19c/0x26c
[ 788.845952][ T1315] el0_svc+0x8/0xc
and then all other processes are waiting on the semaphore causes lock
contentions,
[ 788.849583][ T1315] INFO: task hugemmap05:5027 blocked for more than 122
seconds.
[ 788.857119][ T1315] Tainted: G W 5.2.0-rc6-next-20190624
#2
[ 788.864566][ T1315] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 788.873139][ T1315] hugemmap05 D26960 5027 5026 0x00000000
[ 788.879395][ T1315] Call trace:
[ 788.882576][ T1315] __switch_to+0x2e0/0x37c
[ 788.886901][ T1315] __schedule+0xb74/0xf0c
[ 788.891136][ T1315] schedule+0x60/0x168
[ 788.895097][ T1315] rwsem_down_write_slowpath+0x5a0/0x8c8
[ 788.900653][ T1315] down_write+0xc0/0xc4
[ 788.904715][ T1315] ipcget+0x74/0x674
[ 788.908516][ T1315] ksys_shmget+0x90/0xc4
[ 788.912664][ T1315] __arm64_sys_shmget+0x54/0x88
[ 788.917420][ T1315] el0_svc_handler+0x19c/0x26c
[ 788.922088][ T1315] el0_svc+0x8/0xc
Ideally, it seems only ipc_findkey() and newseg() in this path needs to hold the
semaphore to protect concurrency access, so it could just be converted to a
spinlock instead.
[1] ./hugemmap05 -s -m
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/huget
lb/hugemmap/hugemmap05.c
Powered by blists - more mailing lists