lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <diqzmt3sqxut.fsf@ackerleytng-cloudtop.c.googlers.com>
Date:   Fri, 31 Mar 2023 23:56:10 +0000
From:   Ackerley Tng <ackerleytng@...gle.com>
To:     Christian Brauner <brauner@...nel.org>
Cc:     kvm@...r.kernel.org, linux-api@...r.kernel.org,
        linux-arch@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, qemu-devel@...gnu.org, aarcange@...hat.com,
        ak@...ux.intel.com, akpm@...ux-foundation.org, arnd@...db.de,
        bfields@...ldses.org, bp@...en8.de, chao.p.peng@...ux.intel.com,
        corbet@....net, dave.hansen@...el.com, david@...hat.com,
        ddutile@...hat.com, dhildenb@...hat.com, hpa@...or.com,
        hughd@...gle.com, jlayton@...nel.org, jmattson@...gle.com,
        joro@...tes.org, jun.nakajima@...el.com,
        kirill.shutemov@...ux.intel.com, linmiaohe@...wei.com,
        luto@...nel.org, mail@...iej.szmigiero.name, mhocko@...e.com,
        michael.roth@....com, mingo@...hat.com, naoya.horiguchi@....com,
        pbonzini@...hat.com, qperret@...gle.com, rppt@...nel.org,
        seanjc@...gle.com, shuah@...nel.org, steven.price@....com,
        tabba@...gle.com, tglx@...utronix.de, vannapurve@...gle.com,
        vbabka@...e.cz, vkuznets@...hat.com, wanpengli@...cent.com,
        wei.w.wang@...el.com, x86@...nel.org, yu.c.zhang@...ux.intel.com
Subject: Re: [RFC PATCH v2 1/2] mm: restrictedmem: Allow userspace to specify
 mount for memfd_restricted

Christian Brauner <brauner@...nel.org> writes:

> On Tue, Mar 21, 2023 at 08:15:32PM +0000, Ackerley Tng wrote:
>> By default, the backing shmem file for a restrictedmem fd is created
>> on shmem's kernel space mount.

>> ...

Thanks for reviewing this patch!


> This looks like you can just pass in some tmpfs fd and you just use it
> to identify the mnt and then you create a restricted memfd area in that
> instance. So if I did:

> mount -t tmpfs tmpfs /mnt
> mknod /mnt/bla c 0 0
> fd = open("/mnt/bla")
> memfd_restricted(fd)

> then it would create a memfd restricted entry in the tmpfs instance
> using the arbitrary dummy device node to infer the tmpfs instance.

> Looking at the older thread briefly and the cover letter. Afaict, the
> new mount api shouldn't figure into the design of this. fsopen() returns
> fds referencing a VFS-internal fs_context object. They can't be used to
> create or lookup files or identify mounts. The mount doesn't exist at
> that time. Not even a superblock might exist at the time before
> fsconfig(FSCONFIG_CMD_CREATE).

> When fsmount() is called after superblock setup then it's similar to any
> other fd from open() or open_tree() or whatever (glossing over some
> details that are irrelevant here). Difference is that open_tree() and
> fsmount() would refer to the root of a mount.

This is correct, memfd_restricted() needs an fd returned from fsmount()
and not fsopen(). Usage examples of this new parameter in
memfd_restricted() are available in selftests.


> At first I wondered why this doesn't just use standard *at() semantics
> but I guess the restricted memfd is unlinked and doesn't show up in the
> tmpfs instance.

> So if you go down that route then I would suggest to enforce that the
> provided fd refer to the root of a tmpfs mount. IOW, it can't just be an
> arbitrary file descriptor in a tmpfs instance. That seems cleaner to me:

> sb = f_path->mnt->mnt_sb;
> sb->s_magic == TMPFS_MAGIC && f_path->mnt->mnt_root == sb->s_root

> and has much tigher semantics than just allowing any kind of fd.

Thanks for your suggestion, I've tightened the semantics as you
suggested. memfd_restricted() now only accepts fds representing the root
of the mount.


> Another wrinkly I find odd but that's for you to judge is that this
> bypasses the permission model of the tmpfs instance. IOW, as long as you
> have a handle to the root of a tmpfs mount you can just create
> restricted memfds in there. So if I provided a completely sandboxed
> service - running in a user namespace or whatever - with an fd to the
> host's tmpfs instance they can just create restricted memfds in there no
> questions asked.

> Maybe that's fine but it's certainly something to spell out and think
> about the implications.

Thanks for pointing this out! I added a permissions check in RFC v3, and
clarified the permissions model (please see patch 1 of 2):
https://lore.kernel.org/lkml/cover.1680306489.git.ackerleytng@google.com/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ