lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 27 Apr 2017 17:14:33 +0300
From:   Mike Rapoport <rppt@...ux.vnet.ibm.com>
To:     "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
Cc:     Andrea Arcangeli <aarcange@...hat.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-man@...r.kernel.org, Mike Rapoport <rppt@...ux.vnet.ibm.com>
Subject: [PATCH man-pages 1/2] userfaultfd.2: start documenting non-cooperative events

Signed-off-by: Mike Rapoport <rppt@...ux.vnet.ibm.com>
---
 man2/userfaultfd.2 | 135 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 128 insertions(+), 7 deletions(-)

diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2
index cfea5cb..44af3e4 100644
--- a/man2/userfaultfd.2
+++ b/man2/userfaultfd.2
@@ -75,7 +75,7 @@ flag in
 .PP
 When the last file descriptor referring to a userfaultfd object is closed,
 all memory ranges that were registered with the object are unregistered
-and unread page-fault events are flushed.
+and unread events are flushed.
 .\"
 .SS Usage
 The userfaultfd mechanism is designed to allow a thread in a multithreaded
@@ -99,6 +99,20 @@ In such non-cooperative mode,
 the process that monitors userfaultfd and handles page faults
 needs to be aware of the changes in the virtual memory layout
 of the faulting process to avoid memory corruption.
+
+Starting from Linux 4.11,
+userfaultfd may notify the fault-handling threads about changes
+in the virtual memory layout of the faulting process.
+In addition, if the faulting process invokes
+.BR fork (2)
+system call,
+the userfaultfd objects associated with the parent may be duplicated
+into the child process and the userfaultfd monitor will be notified
+about the file descriptor associated with the userfault objects
+created for the child process,
+which allows userfaultfd monitor to perform user-space paging
+for the child process.
+
 .\" FIXME elaborate about non-cooperating mode, describe its limitations
 .\" for kernels before 4.11, features added in 4.11
 .\" and limitations remaining in 4.11
@@ -144,6 +158,10 @@ Details of the various
 operations can be found in
 .BR ioctl_userfaultfd (2).
 
+Since Linux 4.11, events other than page-fault may enabled during
+.B UFFDIO_API
+operation.
+
 Up to Linux 4.11,
 userfaultfd can be used only with anonymous private memory mappings.
 
@@ -156,7 +174,8 @@ Each
 .BR read (2)
 from the userfaultfd file descriptor returns one or more
 .I uffd_msg
-structures, each of which describes a page-fault event:
+structures, each of which describes a page-fault event
+or an event required for the non-cooperative userfaultfd usage:
 
 .nf
 .in +4n
@@ -168,6 +187,23 @@ struct uffd_msg {
             __u64 flags;        /* Flags describing fault */
             __u64 address;      /* Faulting address */
         } pagefault;
+        struct {
+            __u32 ufd;          /* userfault file descriptor
+                                   of the child process */
+        } fork;                 /* since Linux 4.11 */
+        struct {
+            __u64 from;         /* old address of the
+                                   remapped area */
+            __u64 to;           /* new address of the
+                                   remapped area */
+            __u64 len;          /* original mapping length */
+        } remap;                /* since Linux 4.11 */
+        struct {
+            __u64 start;        /* start address of the
+                                   removed area */
+            __u64 end;          /* end address of the
+                                   removed area */
+        } remove;               /* since Linux 4.11 */
         ...
     } arg;
 
@@ -194,14 +230,73 @@ structure are as follows:
 .TP
 .I event
 The type of event.
-Currently, only one value can appear in this field:
-.BR UFFD_EVENT_PAGEFAULT ,
-which indicates a page-fault event.
+Depending of the event type,
+different fields of the
+.I arg
+union represent details required for the event processing.
+The non-page-fault events are generated only when appropriate feature
+is enabled during API handshake with
+.B UFFDIO_API
+.BR ioctl (2).
+
+The following values can appear in the
+.I event
+field:
+.RS
+.TP
+.B UFFD_EVENT_PAGEFAULT
+A page-fault event.
+The page-fault details are available in the
+.I pagefault
+field.
 .TP
-.I address
+.B UFFD_EVENT_FORK
+Generated when the faulting process invokes
+.BR fork (2)
+system call.
+The event details are available in the
+.I fork
+field.
+.\" FIXME descirbe duplication of userfault file descriptor during fork
+.TP
+.B UFFD_EVENT_REMAP
+Generated when the faulting process invokes
+.BR mremap (2)
+system call.
+The event details are available in the
+.I remap
+field.
+.TP
+.B UFFD_EVENT_REMOVE
+Generated when the faulting process invokes
+.BR madvise (2)
+system call with
+.BR MADV_DONTNEED
+or
+.BR MADV_REMOVE
+advice.
+The event details are available in the
+.I remove
+field.
+.TP
+.B UFFD_EVENT_UNMAP
+Generated when the faulting process unmaps a memory range,
+either explicitly using
+.BR munmap (2)
+system call or implicitly during
+.BR mmap (2)
+or
+.BR mremap (2)
+system calls.
+The event details are available in the
+.I remove
+field.
+.RE
+.TP
+.I pagefault.address
 The address that triggered the page fault.
 .TP
-.I flags
+.I pagefault.flags
 A bit mask of flags that describe the event.
 For
 .BR UFFD_EVENT_PAGEFAULT ,
@@ -218,6 +313,32 @@ otherwise it is a read fault.
 .\"
 .\" UFFD_PAGEFAULT_FLAG_WP is not yet supported.
 .RE
+.TP
+.I fork.ufd
+The file descriptor associated with the userfault object
+created for the child process
+.TP
+.I remap.from
+The original address of the memory range that was remapped using
+.BR mremap (2).
+.TP
+.I remap.to
+The new address of the memory range that was remapped using
+.BR mremap (2).
+.TP
+.I remap.len
+The original length of the the memory range that was remapped using
+.BR mremap (2).
+.TP
+.I remove.start
+The start address of the memory range that was freed using
+.BR madvise (2)
+or unmapped
+.TP
+.I remove.end
+The end address of the memory range that was freed using
+.BR madvise (2)
+or unmapped
 .PP
 A
 .BR read (2)
-- 
1.9.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ