linux-kernel - Re: [PATCH] mm/gup: continue VM_FAULT_RETRY processing event for pre-faults

Message-ID: <20190526172509.GC1282@xo-6d-61-c0.localdomain>
Date:   Sun, 26 May 2019 19:25:09 +0200
From:   Pavel Machek <pavel@....cz>
To:     Hugh Dickins <hughd@...gle.com>
Cc:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Andrea Arcangeli <aarcange@...hat.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Borislav Petkov <bp@...e.de>
Subject: Re: [PATCH] mm/gup: continue VM_FAULT_RETRY processing event for
 pre-faults

On Fri 2019-05-24 15:22:51, Hugh Dickins wrote:
> On Wed, 22 May 2019, Sebastian Andrzej Siewior wrote:
> > On 2019-05-22 12:21:13 [-0700], Andrew Morton wrote:
> > > On Tue, 14 May 2019 17:29:55 +0300 Mike Rapoport <rppt@...ux.ibm.com> wrote:
> > > 
> > > > When get_user_pages*() is called with pages = NULL, the processing of
> > > > VM_FAULT_RETRY terminates early without actually retrying to fault-in all
> > > > the pages.
> > > > 
> > > > If the pages in the requested range belong to a VMA that has userfaultfd
> > > > registered, handle_userfault() returns VM_FAULT_RETRY *after* user space
> > > > has populated the page, but for the gup pre-fault case there's no actual
> > > > retry and the caller will get no pages although they are present.
> > > > 
> > > > This issue was uncovered when running post-copy memory restore in CRIU
> > > > after commit d9c9ce34ed5c ("x86/fpu: Fault-in user stack if
> > > > copy_fpstate_to_sigframe() fails").
> 
> I've been getting unexplained segmentation violations, and "make" giving
> up early, when running kernel builds under swapping memory pressure: no
> CRIU involved.
> 
> Bisected last night to that same x86/fpu commit, not itself guilty, but
> suffering from the odd behavior of get_user_pages_unlocked() giving up
> too early.
> 
> (I wondered at first if copy_fpstate_to_sigframe() ought to retry if
> non-negative ret < nr_pages, but no, that would be wrong: a present page
> followed by an invalid area would repeatedly return 1 for nr_pages 2.)
> 
> Cc'ing Pavel, who's been having segfault trouble in emacs: maybe same?

The emacs segfault was always during process exit. This sounds different...

I don't see problems with make.

But it's true that at least one of the affected machines uses swap heavily.

Best regards,
								Pavel
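
A note on the parenthetical quoted above, about why copy_fpstate_to_sigframe()
should not simply retry whenever a non-negative ret is less than nr_pages. The
following is a rough user-space model in C, not kernel code: model_gup(),
naive_retry_attempts() and the page_valid table are hypothetical names invented
for illustration. It only sketches the case described, a present page followed
by an invalid area, where every call returns 1 for nr_pages 2 and so a naive
retry loop never makes progress.

```c
#include <stdbool.h>

#define NR_PAGES 2

/* Hypothetical layout: page 0 is present, page 1 lies in an invalid area. */
static const bool page_valid[NR_PAGES] = { true, false };

/*
 * Hypothetical stand-in for a gup-style call: returns how many pages,
 * starting at index 0, could be faulted in before the first failure.
 */
int model_gup(int nr_pages)
{
	int done = 0;

	for (int i = 0; i < nr_pages && i < NR_PAGES; i++) {
		if (!page_valid[i])
			break;
		done++;
	}
	return done;
}

/*
 * Naive caller: retry while 0 <= ret < nr_pages.
 * max_tries exists only so that this model terminates; it returns the
 * number of attempts actually made.
 */
int naive_retry_attempts(int nr_pages, int max_tries)
{
	int tries = 0;

	while (tries < max_tries) {
		int ret = model_gup(nr_pages);

		tries++;
		if (ret < 0 || ret == nr_pages)
			break;	/* hard error or full success: stop */
		/* short return: the naive policy tries again */
	}
	return tries;
}
```

In this model model_gup(2) returns 1 on every call, so naive_retry_attempts()
always runs until max_tries is exhausted. That is consistent with the quoted
conclusion that the early give-up has to be fixed on the gup side rather than
in the caller.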