Message-ID: <20191025114633.GE17610@dhcp22.suse.cz>
Date: Fri, 25 Oct 2019 13:46:33 +0200
From: Michal Hocko <mhocko@...nel.org>
To: snazy@...zy.de
Cc: Randy Dunlap <rdunlap@...radead.org>, linux-kernel@...r.kernel.org,
Linux MM <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
"Potyra, Stefan" <Stefan.Potyra@...ktrobit.com>
Subject: Re: mlockall(MCL_CURRENT) blocking infinitely
On Fri 25-10-19 13:02:23, Robert Stupp wrote:
> On Fri, 2019-10-25 at 11:21 +0200, Michal Hocko wrote:
> > On Thu 24-10-19 16:34:46, Randy Dunlap wrote:
> > > [adding linux-mm + people]
> > >
> > > On 10/24/19 12:36 AM, Robert Stupp wrote:
> > > > Hi guys,
> > > >
> > > > I've got an issue with `mlockall(MCL_CURRENT)` after upgrading
> > > > Ubuntu 19.04 to 19.10 - i.e. kernel version change from 5.0.x to
> > > > 5.3.x.
> > > >
> > > > The following simple program hangs forever with one CPU running
> > > > at 100% (kernel):
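
The program itself was trimmed from the quote above. I assume it boils
down to something like the sketch below - allocate and touch some
memory, then call mlockall(MCL_CURRENT) - but note this is my
reconstruction, not the original program:

/* Hypothetical minimal reproducer - my assumption of what the trimmed
 * test program roughly does, not the original. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 64 << 20;	/* 64 MiB, arbitrary size */
	char *buf = malloc(len);

	if (!buf) {
		perror("malloc");
		return 1;
	}
	memset(buf, 1, len);	/* make sure the pages actually exist */

	/* This is the call that reportedly never returns. */
	if (mlockall(MCL_CURRENT)) {
		perror("mlockall");
		return 1;
	}
	puts("mlockall returned");
	return 0;
}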
> >
> > Can you capture several snapshots of /proc/$(pidof $YOURTASK)/stack
> > while this is happening?
>
> Sure,
>
> Approach:
> - one shell running
> while true; do cat /proc/$(pidof test)/stack; done
> - starting ./test in another shell and hitting Ctrl-C quite a few times
>
> The vast majority of ./test invocations returned an empty 'stack' file.
> Some runs, maybe 1 in 20, returned these snapshots.
> I was running 5.3.7 for this test.
>
>
> [<0>] __handle_mm_fault+0x4c5/0x7a0
> [<0>] handle_mm_fault+0xca/0x1f0
> [<0>] __get_user_pages+0x230/0x770
> [<0>] populate_vma_page_range+0x74/0x80
> [<0>] __mm_populate+0xb1/0x150
> [<0>] __x64_sys_mlockall+0x11c/0x190
> [<0>] do_syscall_64+0x5a/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>
> [<0>] __handle_mm_fault+0x4c5/0x7a0
> [<0>] handle_mm_fault+0xca/0x1f0
> [<0>] __get_user_pages+0x230/0x770
> [<0>] populate_vma_page_range+0x74/0x80
> [<0>] __mm_populate+0xb1/0x150
> [<0>] __x64_sys_mlockall+0x11c/0x190
> [<0>] do_syscall_64+0x5a/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>
> [<0>] __handle_mm_fault+0x4c5/0x7a0
> [<0>] handle_mm_fault+0xca/0x1f0
> [<0>] __get_user_pages+0x230/0x770
> [<0>] populate_vma_page_range+0x74/0x80
> [<0>] __mm_populate+0xb1/0x150
> [<0>] __x64_sys_mlockall+0x11c/0x190
> [<0>] do_syscall_64+0x5a/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>
> [<0>] __do_fault+0x3c/0x130
> [<0>] do_fault+0x248/0x640
> [<0>] __handle_mm_fault+0x4c5/0x7a0
> [<0>] handle_mm_fault+0xca/0x1f0
> [<0>] __get_user_pages+0x230/0x770
> [<0>] populate_vma_page_range+0x74/0x80
> [<0>] __mm_populate+0xb1/0x150
> [<0>] __x64_sys_mlockall+0x11c/0x190
> [<0>] do_syscall_64+0x5a/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
This is expected.
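mlockall(MCL_CURRENT) ends up in __mm_populate(), which walks the
address space VMA by VMA and faults the pages in. From memory, the
loop is roughly the following (a simplified sketch of the shape of it,
see mm/gup.c for the real code):

/* Simplified sketch of the __mm_populate() loop - not the actual
 * 5.3 source, just an outline of the logic. */
int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
{
	unsigned long nstart, nend, end = start + len;
	struct vm_area_struct *vma;
	long ret = 0;

	for (nstart = start; nstart < end; nstart = nend) {
		/* Find the VMA covering the next chunk of the range. */
		vma = find_vma(current->mm, nstart);
		if (!vma || vma->vm_start >= end)
			break;
		nend = min(end, vma->vm_end);
		if (nstart < vma->vm_start)
			nstart = vma->vm_start;
		/* Fault pages in; returns the number of pages handled. */
		ret = populate_vma_page_range(vma, nstart, nend, NULL);
		if (ret < 0) {
			if (ignore_errors) {
				ret = 0;
				continue;	/* move on to the next VMA */
			}
			break;
		}
		/* Continue right after the pages just populated. */
		nend = nstart + ret * PAGE_SIZE;
		ret = 0;
	}
	return ret;
}

So a task showing stacks like the above just means it is still busy
faulting pages in; the interesting question is why it never makes
progress through the range.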
> // I doubt this one is relevant
> [<0>] __wake_up_common_lock+0x7c/0xc0
> [<0>] __wake_up_sync_key+0x1e/0x30
> [<0>] __wake_up_parent+0x26/0x30
> [<0>] do_notify_parent+0x1cc/0x280
> [<0>] do_exit+0x703/0xaf0
> [<0>] do_group_exit+0x47/0xb0
> [<0>] get_signal+0x165/0x880
> [<0>] do_signal+0x34/0x280
> [<0>] exit_to_usermode_loop+0xbf/0x160
> [<0>] do_syscall_64+0x10f/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Hmm, this means that the task has exited, so how come there are other
syscalls still happening? Are you sure you are collecting stacks for
the correct task?
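
Note that $(pidof test) is re-evaluated on every iteration of your
while loop, so if ./test exits or gets restarted at the wrong moment,
cat can pick up a dying or freshly started instance - which would
produce exactly such an exit trace. Resolving the pid once avoids that
race, e.g.:

pid=$(pidof test); while true; do cat /proc/$pid/stack; done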
--
Michal Hocko
SUSE Labs