linux-kernel - Re: [PATCH] mm/gup: continue VM_FAULT_RETRY processing event for pre-faults

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20190522122113.a2edc8aba32f0fad189bae21@linux-foundation.org>
Date:   Wed, 22 May 2019 12:21:13 -0700
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Mike Rapoport <rppt@...ux.ibm.com>
Cc:     Andrea Arcangeli <aarcange@...hat.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Borislav Petkov <bp@...e.de>
Subject: Re: [PATCH] mm/gup: continue VM_FAULT_RETRY processing event for
 pre-faults

On Tue, 14 May 2019 17:29:55 +0300 Mike Rapoport <rppt@...ux.ibm.com> wrote:

> When get_user_pages*() is called with pages = NULL, the processing of
> VM_FAULT_RETRY terminates early without actually retrying to fault-in all
> the pages.
> 
> If the pages in the requested range belong to a VMA that has userfaultfd
> registered, handle_userfault() returns VM_FAULT_RETRY *after* user space
> has populated the page, but for the gup pre-fault case there's no actual
> retry and the caller will get no pages although they are present.
> 
> This issue was uncovered when running post-copy memory restore in CRIU
> after commit d9c9ce34ed5c ("x86/fpu: Fault-in user stack if
> copy_fpstate_to_sigframe() fails").
> 
> After this change, the copying of FPU state to the sigframe switched from
> copy_to_user() variants which caused a real page fault to get_user_pages()
> with pages parameter set to NULL.

You're saying that argument buf_fx in copy_fpstate_to_sigframe() is NULL?

If so was that expected by the (now cc'ed) developers of
d9c9ce34ed5c8923 ("x86/fpu: Fault-in user stack if
copy_fpstate_to_sigframe() fails")?

It seems rather odd.  copy_fpregs_to_sigframe() doesn't look like it's
expecting a NULL argument.

Also, I wonder if copy_fpstate_to_sigframe() would be better using
fault_in_pages_writeable() rather than get_user_pages_unlocked().  That
seems like it operates at a more suitable level and I guess it will fix
this issue also.

> In post-copy mode of CRIU, the destination memory is managed with
> userfaultfd and lack of the retry for pre-fault case in get_user_pages()
> causes a crash of the restored process.
> 
> Making the pre-fault behavior of get_user_pages() the same as the "normal"
> one fixes the issue.

Should this be backported into -stable trees?

> Fixes: d9c9ce34ed5c ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails")
> Signed-off-by: Mike Rapoport <rppt@...ux.ibm.com>