lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 24 May 2018 19:40:07 +0300
From:   Pavel Emelyanov <xemul@...tuozzo.com>
To:     Mike Rapoport <rppt@...ux.vnet.ibm.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        linux-mm <linux-mm@...ck.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Andrei Vagin <avagin@...tuozzo.com>
Subject: Re: [PATCH] userfaultfd: prevent non-cooperative events vs
 mcopy_atomic races

On 05/24/2018 02:56 PM, Mike Rapoport wrote:
> On Thu, May 24, 2018 at 02:24:37PM +0300, Pavel Emelyanov wrote:
>> On 05/23/2018 10:42 AM, Mike Rapoport wrote:
>>> If a process monitored with userfaultfd changes it's memory mappings or
>>> forks() at the same time as uffd monitor fills the process memory with
>>> UFFDIO_COPY, the actual creation of page table entries and copying of the
>>> data in mcopy_atomic may happen either before of after the memory mapping
>>> modifications and there is no way for the uffd monitor to maintain
>>> consistent view of the process memory layout.
>>>
>>> For instance, let's consider fork() running in parallel with
>>> userfaultfd_copy():
>>>
>>> process        		         |	uffd monitor
>>> ---------------------------------+------------------------------
>>> fork()        		         | userfaultfd_copy()
>>> ...        		         | ...
>>>     dup_mmap()        	         |     down_read(mmap_sem)
>>>     down_write(mmap_sem)         |     /* create PTEs, copy data */
>>>         dup_uffd()               |     up_read(mmap_sem)
>>>         copy_page_range()        |
>>>         up_write(mmap_sem)       |
>>>         dup_uffd_complete()      |
>>>             /* notify monitor */ |
>>>
>>> If the userfaultfd_copy() takes the mmap_sem first, the new page(s) will be
>>> present by the time copy_page_range() is called and they will appear in the
>>> child's memory mappings. However, if the fork() is the first to take the
>>> mmap_sem, the new pages won't be mapped in the child's address space.
>>
>> But in this case child should get an entry, that emits a message to uffd when step upon!
>> And uffd will just userfaultfd_copy() it again. No?
>  
> There will be a message, indeed. But there is no way for monitor to tell
> whether the pages it copied are present or not in the child.

If there's a message, then they are not present, that's for sure :)

> Since the monitor cannot assume that the process will access all its memory
> it has to copy some pages "in the background". A simple monitor may look
> like:
> 
> 	for (;;) {
> 		wait_for_uffd_events(timeout);
> 		handle_uffd_events();
> 		uffd_copy(some not faulted pages);
> 	}
> 
> Then, if the "background" uffd_copy() races with fork, the pages we've
> copied may be already present in parent's mappings before the call to
> copy_page_range() and may be not.
> 
> If the pages were not present, uffd_copy'ing them again to the child's
> memory would be ok.

Yes.

> But if uffd_copy() was first to catch mmap_sem, and we would uffd_copy them
> again, child process will get memory corruption.

You mean the background uffd_copy()? But doesn't it race even with regular PF handling,
not only the fork? How do we handle this race?

-- Pavel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ