linux-kernel - Re: mlockall(MCL_CURRENT) blocking infinitely

Message-ID: <9072e55e97f0c4f3b286eb57639c4e9bb4223dfc.camel@gmx.de>
Date:   Wed, 06 Nov 2019 11:25:54 +0100
From:   Robert Stupp <snazy@....de>
To:     Johannes Weiner <hannes@...xchg.org>,
        Vlastimil Babka <vbabka@...e.cz>
Cc:     Michal Hocko <mhocko@...nel.org>,
        Josef Bacik <josef@...icpanda.com>, Jan Kara <jack@...e.cz>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Randy Dunlap <rdunlap@...radead.org>,
        linux-kernel@...r.kernel.org, Linux MM <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Potyra, Stefan" <Stefan.Potyra@...ktrobit.com>
Subject: Re: mlockall(MCL_CURRENT) blocking infinitely

Here's one more dmesg output with more information captured in
__get_user_pages() as well. It basically confirms that
handle_mm_fault() returns VM_FAULT_RETRY.

I'm not sure where and what to change ("fix with a FOLL_TRIED
somewhere") to make it work. My (uneducated) impression is that only
__get_user_pages() needs to be changed, but I might be wrong.

On Tue, 2019-11-05 at 21:05 +0100, Robert Stupp wrote:
> On Tue, 2019-11-05 at 13:22 -0500, Johannes Weiner wrote:
> > Judging from Robert's stack captures, the task is not hung but
> > busy-looping in __mm_populate(). AFAICS, the only way this can occur
> > is if populate_vma_page_range() returns 0 and we don't advance the
> > iteration position (if it returned an error, we wouldn't reset nend
> > and move on to the next vma as ignore_errors is 1 for mlockall.)
> >
> > populate_vma_page_range() returns 0 when the first page is not found
> > and faultin_page() returns -EBUSY (if it were processing pages, or if
> > the error from faultin_page() would be a different one, we would
> > return the number of pages processed or -error).
> >
> > faultin_page() returns -EBUSY when VM_FAULT_RETRY is set, i.e. we
> > dropped the mmap_sem in order to initiate IO and require a retry. That
> > is consistent with the bisect result (new VM_FAULT_RETRY conditions).
> >
> > At this point, regular page fault would retry with FAULT_FLAG_TRIED to
> > indicate that the mmap_sem cannot be dropped a second time. But this
> > mlock path doesn't set that flag and we can loop repeatedly. That is
> > something we probably need to fix with a FOLL_TRIED somewhere.
> >
> > What I don't quite understand yet is why the fault path doesn't make
> > progress eventually. We must drop the mmap_sem without changing the
> > state in any way. How can we keep looping on the same page?
>
> I've played around a bit by adding some `printk` messages (see attached
> patch) and found exactly what you describe: it's busy-looping in
> __mm_populate(), because populate_vma_page_range() returns 0.
>
> However, there's a slightly interesting thing in there. Before it loops
> forever, it processes
> 	nstart=5574d92e1000
> 	locked=1
> 	vma->vm_start=7f5e4bfec000
> 	vma->vm_end=  7f5e4c011000
> 	vma->vm_flags=8002071
> for which populate_vma_page_range() returns 1, then it processes this
> over and over again:
> 	nstart=7f5e4bfed000
> 	locked=0
> 	vma->vm_start=7f5e4bfec000  (same as before)
> 	vma->vm_end=  7f5e4c011000
> 	vma->vm_flags=8002071
> These are the additional dmesg messages with timestamp 105.x. At
> timestamp 106.x, I've hit ctrl-c (ret=-512).
>
> dmesg output with the patch applied (on top of the v5.3.8 git tag)
> attached.
>

View attachment "dmesg-out.txt" of type "text/plain" (32005 bytes)

View attachment "gup-printk.txt" of type "text/plain" (12272 bytes)