lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a95f9ae6-f7db-1ed9-6e25-99ced1fd37a3@gmail.com>
Date:   Thu, 27 Apr 2017 19:26:16 +0200
From:   "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To:     Mike Rapoport <rppt@...ux.vnet.ibm.com>
Cc:     mtk.manpages@...il.com, Andrea Arcangeli <aarcange@...hat.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-man@...r.kernel.org
Subject: Re: [PATCH man-pages 1/2] userfaultfd.2: start documenting
 non-cooperative events

Hi Mike,

I've applied this, but have some questions/points I think 
further clarification.

On 04/27/2017 04:14 PM, Mike Rapoport wrote:
> Signed-off-by: Mike Rapoport <rppt@...ux.vnet.ibm.com>
> ---
>  man2/userfaultfd.2 | 135 ++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 128 insertions(+), 7 deletions(-)
> 
> diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2
> index cfea5cb..44af3e4 100644
> --- a/man2/userfaultfd.2
> +++ b/man2/userfaultfd.2
> @@ -75,7 +75,7 @@ flag in
>  .PP
>  When the last file descriptor referring to a userfaultfd object is closed,
>  all memory ranges that were registered with the object are unregistered
> -and unread page-fault events are flushed.
> +and unread events are flushed.
>  .\"
>  .SS Usage
>  The userfaultfd mechanism is designed to allow a thread in a multithreaded
> @@ -99,6 +99,20 @@ In such non-cooperative mode,
>  the process that monitors userfaultfd and handles page faults
>  needs to be aware of the changes in the virtual memory layout
>  of the faulting process to avoid memory corruption.
> +
> +Starting from Linux 4.11,
> +userfaultfd may notify the fault-handling threads about changes
> +in the virtual memory layout of the faulting process.
> +In addition, if the faulting process invokes
> +.BR fork (2)
> +system call,
> +the userfaultfd objects associated with the parent may be duplicated
> +into the child process and the userfaultfd monitor will be notified
> +about the file descriptor associated with the userfault objects

What does "notified about the file descriptor" mean?

> +created for the child process,
> +which allows userfaultfd monitor to perform user-space paging
> +for the child process.
> +
>  .\" FIXME elaborate about non-cooperating mode, describe its limitations
>  .\" for kernels before 4.11, features added in 4.11
>  .\" and limitations remaining in 4.11
> @@ -144,6 +158,10 @@ Details of the various
>  operations can be found in
>  .BR ioctl_userfaultfd (2).
>  
> +Since Linux 4.11, events other than page-fault may enabled during
> +.B UFFDIO_API
> +operation.
> +
>  Up to Linux 4.11,
>  userfaultfd can be used only with anonymous private memory mappings.
>  
> @@ -156,7 +174,8 @@ Each
>  .BR read (2)
>  from the userfaultfd file descriptor returns one or more
>  .I uffd_msg
> -structures, each of which describes a page-fault event:
> +structures, each of which describes a page-fault event
> +or an event required for the non-cooperative userfaultfd usage:
>  
>  .nf
>  .in +4n
> @@ -168,6 +187,23 @@ struct uffd_msg {
>              __u64 flags;        /* Flags describing fault */
>              __u64 address;      /* Faulting address */
>          } pagefault;
> +        struct {
> +            __u32 ufd;          /* userfault file descriptor
> +                                   of the child process */
> +        } fork;                 /* since Linux 4.11 */
> +        struct {
> +            __u64 from;         /* old address of the
> +                                   remapped area */
> +            __u64 to;           /* new address of the
> +                                   remapped area */
> +            __u64 len;          /* original mapping length */
> +        } remap;                /* since Linux 4.11 */
> +        struct {
> +            __u64 start;        /* start address of the
> +                                   removed area */
> +            __u64 end;          /* end address of the
> +                                   removed area */
> +        } remove;               /* since Linux 4.11 */
>          ...
>      } arg;
>  
> @@ -194,14 +230,73 @@ structure are as follows:
>  .TP
>  .I event
>  The type of event.
> -Currently, only one value can appear in this field:
> -.BR UFFD_EVENT_PAGEFAULT ,
> -which indicates a page-fault event.
> +Depending of the event type,
> +different fields of the
> +.I arg
> +union represent details required for the event processing.
> +The non-page-fault events are generated only when appropriate feature
> +is enabled during API handshake with
> +.B UFFDIO_API
> +.BR ioctl (2).
> +
> +The following values can appear in the
> +.I event
> +field:
> +.RS
> +.TP
> +.B UFFD_EVENT_PAGEFAULT
> +A page-fault event.
> +The page-fault details are available in the
> +.I pagefault
> +field.
>  .TP
> -.I address
> +.B UFFD_EVENT_FORK
> +Generated when the faulting process invokes
> +.BR fork (2)
> +system call.
> +The event details are available in the
> +.I fork
> +field.
> +.\" FIXME descirbe duplication of userfault file descriptor during fork
> +.TP
> +.B UFFD_EVENT_REMAP
> +Generated when the faulting process invokes
> +.BR mremap (2)
> +system call.
> +The event details are available in the
> +.I remap
> +field.
> +.TP
> +.B UFFD_EVENT_REMOVE
> +Generated when the faulting process invokes
> +.BR madvise (2)
> +system call with
> +.BR MADV_DONTNEED
> +or
> +.BR MADV_REMOVE
> +advice.
> +The event details are available in the
> +.I remove
> +field.
> +.TP
> +.B UFFD_EVENT_UNMAP
> +Generated when the faulting process unmaps a memory range,
> +either explicitly using
> +.BR munmap (2)
> +system call or implicitly during
> +.BR mmap (2)
> +or
> +.BR mremap (2)
> +system calls.
> +The event details are available in the
> +.I remove
> +field.
> +.RE
> +.TP
> +.I pagefault.address
>  The address that triggered the page fault.
>  .TP
> -.I flags
> +.I pagefault.flags
>  A bit mask of flags that describe the event.
>  For
>  .BR UFFD_EVENT_PAGEFAULT ,
> @@ -218,6 +313,32 @@ otherwise it is a read fault.
>  .\"
>  .\" UFFD_PAGEFAULT_FLAG_WP is not yet supported.
>  .RE
> +.TP
> +.I fork.ufd
> +The file descriptor associated with the userfault object
> +created for the child process
> +.TP
> +.I remap.from
> +The original address of the memory range that was remapped using
> +.BR mremap (2).
> +.TP
> +.I remap.to
> +The new address of the memory range that was remapped using
> +.BR mremap (2).
> +.TP
> +.I remap.len
> +The original length of the the memory range that was remapped using
> +.BR mremap (2).
> +.TP
> +.I remove.start
> +The start address of the memory range that was freed using
> +.BR madvise (2)
> +or unmapped
> +.TP
> +.I remove.end
> +The end address of the memory range that was freed using
> +.BR madvise (2)
> +or unmapped
>  .PP
>  A
>  .BR read (2)

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ