linux-kernel - Re: Sealed memfd & no-fault mmap

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <d9rpd_hm_ereswX76EqjEGkqfjFFSi-N_yj8b1pj4MZMFy-fpiicN_XrHl13sXqkkgzAJqZEy1roQsVklWEhY38-olslcbO34GB0YcjHks8=@emersion.fr>
Date:   Thu, 03 Jun 2021 13:14:47 +0000
From:   Simon Ser <contact@...rsion.fr>
To:     Hugh Dickins <hughd@...gle.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        "Lin, Ming" <minggr@...il.com>, Peter Xu <peterx@...hat.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Matthew Wilcox <willy@...radead.org>,
        Dan Williams <dan.j.williams@...el.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Will Deacon <will@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        David Herrmann <dh.herrmann@...il.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Greg Kroah-Hartman <greg@...ah.com>,
        "tytso@....edu" <tytso@....edu>
Subject: Re: Sealed memfd & no-fault mmap

On Saturday, May 29th, 2021 at 10:15 PM, Hugh Dickins <hughd@...gle.com> wrote:

> And IIUC it would have to be the recipient (Wayland compositor) doing
> the NOFAULT business, because (going back to the original mail) we are
> only considering this so that Wayland might satisfy clients who predate
> or refuse Linux-only APIs.  So, an ioctl (or fcntl, as sealing chose)
> at the client end cannot be expected; and could not be relied on anyway.

Yes, that is correct.

> NOFAULT? Does BSD use "fault" differently, and in Linux terms we
> would say NOSIGBUS to mean the same?
>
> Can someone point to a specification of BSD's __MAP_NOFAULT?
> Searching just found me references to bugs.

__MAP_NOFAULT isn't documented, sadly. The commit that introduces the
flag [1] is the best we're going to get, I think.

> What mainly worries me about the suggestion is: what happens to the
> zero page inserted into NOFAULT mappings, when later a page for that
> offset is created and added to page cache?

Not 100% sure exactly this means what I think it means, but from my PoV,
it's fine if the contents of an expanded shm file aren't visible from the
process that has mapped it with MAP_NOFAULT/MAP_NOSIGBUS. In other words,
it's fine if:

- The client sets up a 1KiB shm file and sends it to the compositor.
- The compositor maps it with MAP_NOFAULT/MAP_NOSIGBUS.
- The client expands the file to 2KiB and writes interesting data in it.
- The compositor still sees zeros past the 1KiB mark. The compositor needs
  to unmap and re-map the file to see the data past the 1KiB mark.

If the MAP_NOFAULT/MAP_NOSIGBUS flag only affects the mapping itself and
nothing else, this should be fine?

> Treating it as an opaque blob of zeroes, that stays there ever after,
> hiding the subsequent data: easy to implement, but a hack that we would
> probably regret.  (And I notice that even the quote from David Herrmann
> in the original post allows for the possibility that client may want to
> expand the object.)
>
> I believe the correct behaviour would be to unmap the nofault page
> then, allowing the proper page to be faulted in after.  That is
> certainly doable (the old mm/filemap_xip.c used to do so), but might
> get into some awkward race territory, with filesystem dependence
> (reminiscent of hole punch, in reverse).  shmem could operate that
> way, and be the better for it: but I wouldn't want to add that,
> without also cleaning away all the shmem_recalc_inode() stuff.

[1]: https://github.com/openbsd/src/commit/37f480c7e4870332b7ffb802fa6578f547c8a19f