Message-ID: <20200826110136.GA69706@kernel.org>
Date:   Wed, 26 Aug 2020 14:01:36 +0300
From:   Mike Rapoport <rppt@...nel.org>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Alexander Viro <viro@...iv.linux.org.uk>,
        Andy Lutomirski <luto@...nel.org>,
        Arnd Bergmann <arnd@...db.de>, Borislav Petkov <bp@...en8.de>,
        Catalin Marinas <catalin.marinas@....com>,
        Christopher Lameter <cl@...ux.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Elena Reshetova <elena.reshetova@...el.com>,
        "H. Peter Anvin" <hpa@...or.com>, Idan Yaniv <idan.yaniv@....com>,
        Ingo Molnar <mingo@...hat.com>,
        James Bottomley <jejb@...ux.ibm.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Matthew Wilcox <willy@...radead.org>,
        Mark Rutland <mark.rutland@....com>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Michael Kerrisk <mtk.manpages@...il.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Tycho Andersen <tycho@...ho.ws>, Will Deacon <will@...nel.org>,
        linux-api@...r.kernel.org, linux-arch@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, linux-nvdimm@...ts.01.org,
        linux-riscv@...ts.infradead.org, x86@...nel.org
Subject: Re: [PATCH v4 0/6] mm: introduce memfd_secret system call to create
 "secret" memory areas
Any comments on this?

On Tue, Aug 18, 2020 at 05:15:48PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@...ux.ibm.com>
> 
> Hi,
> 
> This is an implementation of "secret" mappings backed by a file descriptor. 
> 
> v4 changes:
> * rebase on v5.9-rc1
> * Do not redefine PMD_PAGE_ORDER in fs/dax.c, thanks Kirill
> * Make secret mappings exclusive by default and require flags to the
>   memfd_secret() system call only for uncached mappings, thanks again
>   Kirill :)
> 
> v3 changes:
> * Squash kernel-parameters.txt update into the commit that added the
>   command line option.
> * Make uncached mode explicitly selectable by architectures. For now enable
>   it only on x86.
> 
> v2 changes:
> * Follow Michael's suggestion and name the new system call 'memfd_secret'
> * Add kernel-parameters documentation about the boot option
> * Fix i386-tinyconfig regression reported by the kbuild bot.
>   CONFIG_SECRETMEM now depends on !EMBEDDED, which disables it on small
>   systems while still making it available unconditionally on
>   architectures that support SET_DIRECT_MAP.
> 
> 
> The file descriptor backing secret memory mappings is created using a
> dedicated memfd_secret system call. The desired protection mode for the
> memory is configured using the flags parameter of the system call. The
> mmap() of the file descriptor created with memfd_secret() will create a
> "secret" memory mapping. The pages in that mapping will be marked as not
> present in the direct map and will have the desired protection bits set in
> the user page table. For instance, the current implementation allows
> uncached mappings.
> 
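A minimal userspace sketch of the flow described above, not taken from the
series itself. Assumptions: the syscall number 447 is the one memfd_secret
received when it was later merged into mainline (a kernel carrying this v4
may use a different number), the descriptor is sized with ftruncate() as in
the mainline version, and secret_roundtrip() is a hypothetical helper name.
Flags are 0, i.e. the default exclusive mode.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 447 is the mainline number; this v4 series may differ. */
#ifndef SYS_memfd_secret
#define SYS_memfd_secret 447
#endif

/*
 * Hypothetical helper: store msg in one page of secret memory and copy it
 * back into out. Returns 0 on success or a negative errno value, e.g.
 * -ENOSYS when the kernel does not provide (or has disabled) memfd_secret.
 */
static int secret_roundtrip(const char *msg, char *out, size_t out_len)
{
	assert(msg && out && out_len > 0);

	long page = sysconf(_SC_PAGESIZE);
	if (page <= 0)
		return -EINVAL;

	/* flags == 0: default mode, no uncached mapping requested */
	int fd = (int)syscall(SYS_memfd_secret, 0U);
	if (fd < 0)
		return -errno;

	int err = 0;
	char *p = MAP_FAILED;

	/* Assumption: the area is sized with ftruncate() before mmap() */
	if (ftruncate(fd, (off_t)page) < 0) {
		err = -errno;
		goto out;
	}

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		err = -errno;
		goto out;
	}

	snprintf(p, (size_t)page, "%s", msg);   /* write into the secret page */
	snprintf(out, out_len, "%s", p);        /* read it back */
	memset(p, 0, (size_t)page);             /* scrub before unmapping */
	munmap(p, (size_t)page);
out:
	close(fd);
	return err;
}
```

A caller would check the return value and fall back to ordinary memory when
it is negative, since the call can legitimately fail on kernels without
secretmem support.
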
> Although normally Linux userspace mappings are protected from other users,
> such secret mappings are useful for environments where a hostile tenant is
> trying to trick the kernel into giving them access to other tenants'
> mappings.
> 
> Additionally, the secret mappings may be used as a means to protect guest
> memory in a virtual machine host.
> 
> To demonstrate secret memory usage we've created a userspace library [1]
> that does two things. First, it acts as a preloader for openssl that
> redirects all the OPENSSL_malloc calls to secret memory, so any secret
> keys get automatically protected this way. Second, it exposes the API to
> the user who needs it. We anticipate that a lot of the use cases would be
> like the openssl one: many toolkits that deal with secret keys already
> have special handling for the memory to try to give them greater
> protection, so this would simply be pluggable into the toolkits without
> any need for user application modification.
> 
> I hesitated over whether to continue using new flags to memfd_create() or
> to add a new system call, and I decided on a new system call after I
> started looking into the man pages update. There would have been two
> completely independent descriptions and I think it would have been very
> confusing.
> 
> Hiding secret memory mappings behind an anonymous file allows (ab)use of
> the page cache for tracking pages allocated for the "secret" mappings as
> well as using address_space_operations for e.g. page migration callbacks.
> 
> The anonymous file may also be used implicitly, like hugetlb files, to
> implement mmap(MAP_SECRET) and use the secret memory areas with "native" mm
> ABIs in the future.
> 
> As the fragmentation of the direct map was one of the major concerns raised
> during the previous postings, I've added an amortizing cache of PMD-size
> pages to each file descriptor and the ability to reserve large chunks of
> physical memory at boot time and then use this memory as an allocation pool
> for the secret memory areas.
> 
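To illustrate the amortization idea only, here is a toy userspace model; it
is not the code in mm/secretmem.c. Assumptions: PMD_PAGE_ORDER is taken to
be PMD_SHIFT - PAGE_SHIFT, the shift values are those of x86-64 with 4K
pages, and struct toy_pool and toy_pool_alloc_page() are hypothetical names.
The point is that one large allocation (and, in the kernel, one split of the
direct map) serves many small page requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed x86-64 values with 4K base pages */
#define TOY_PAGE_SHIFT      12
#define TOY_PMD_SHIFT       21
#define TOY_PMD_PAGE_ORDER  (TOY_PMD_SHIFT - TOY_PAGE_SHIFT)   /* 9 */
#define TOY_PAGES_PER_PMD   (1UL << TOY_PMD_PAGE_ORDER)        /* 512 */
#define TOY_PMD_SIZE        (1UL << TOY_PMD_SHIFT)             /* 2 MiB */

struct toy_pool {
	char *chunk;            /* current PMD-size chunk, NULL if none yet */
	unsigned long next;     /* index of the next unused page in chunk */
	unsigned long refills;  /* how many PMD-size chunks were allocated */
};

/*
 * Hand out one page-size piece of the current chunk, grabbing a new
 * PMD-size, PMD-aligned chunk only when the current one is exhausted.
 * Toy model: exhausted chunks are never freed.
 */
static void *toy_pool_alloc_page(struct toy_pool *pool)
{
	assert(pool != NULL);

	if (!pool->chunk || pool->next == TOY_PAGES_PER_PMD) {
		char *chunk = aligned_alloc(TOY_PMD_SIZE, TOY_PMD_SIZE);
		if (!chunk)
			return NULL;
		pool->chunk = chunk;
		pool->next = 0;
		pool->refills++;
	}
	return pool->chunk + (pool->next++ << TOY_PAGE_SHIFT);
}
```

With these values the first 512 page requests are served from a single
chunk, and only the 513th triggers a second large allocation.
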
> v3: https://lore.kernel.org/lkml/20200804095035.18778-1-rppt@kernel.org
> v2: https://lore.kernel.org/lkml/20200727162935.31714-1-rppt@kernel.org
> v1: https://lore.kernel.org/lkml/20200720092435.17469-1-rppt@kernel.org/
> rfc-v2: https://lore.kernel.org/lkml/20200706172051.19465-1-rppt@kernel.org/
> rfc-v1: https://lore.kernel.org/lkml/20200130162340.GA14232@rapoport-lnx/
> 
> Mike Rapoport (6):
>   mm: add definition of PMD_PAGE_ORDER
>   mmap: make mlock_future_check() global
>   mm: introduce memfd_secret system call to create "secret" memory areas
>   arch, mm: wire up memfd_secret system call where relevant
>   mm: secretmem: use PMD-size pages to amortize direct map fragmentation
>   mm: secretmem: add ability to reserve memory at boot
> 
>  arch/Kconfig                           |   7 +
>  arch/arm64/include/asm/unistd.h        |   2 +-
>  arch/arm64/include/asm/unistd32.h      |   2 +
>  arch/arm64/include/uapi/asm/unistd.h   |   1 +
>  arch/riscv/include/asm/unistd.h        |   1 +
>  arch/x86/Kconfig                       |   1 +
>  arch/x86/entry/syscalls/syscall_32.tbl |   1 +
>  arch/x86/entry/syscalls/syscall_64.tbl |   1 +
>  fs/dax.c                               |  11 +-
>  include/linux/pgtable.h                |   3 +
>  include/linux/syscalls.h               |   1 +
>  include/uapi/asm-generic/unistd.h      |   7 +-
>  include/uapi/linux/magic.h             |   1 +
>  include/uapi/linux/secretmem.h         |   8 +
>  kernel/sys_ni.c                        |   2 +
>  mm/Kconfig                             |   4 +
>  mm/Makefile                            |   1 +
>  mm/internal.h                          |   3 +
>  mm/mmap.c                              |   5 +-
>  mm/secretmem.c                         | 451 +++++++++++++++++++++++++
>  20 files changed, 501 insertions(+), 12 deletions(-)
>  create mode 100644 include/uapi/linux/secretmem.h
>  create mode 100644 mm/secretmem.c
> 
> -- 
> 2.26.2
> 
-- 
Sincerely yours,
Mike.