Message-ID: <CAG48ez0KWgLMOp1d3X1AcRNc4-eF1YiCw=PgWiGjtM6PqQqawg@mail.gmail.com>
Date: Wed, 8 Apr 2020 18:24:16 +0200
From: Jann Horn <jannh@...gle.com>
To: Christian Brauner <christian.brauner@...ntu.com>
Cc: Jens Axboe <axboe@...nel.dk>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
kernel list <linux-kernel@...r.kernel.org>,
linux-block@...r.kernel.org, Linux API <linux-api@...r.kernel.org>,
Jonathan Corbet <corbet@....net>,
Serge Hallyn <serge@...lyn.com>,
"Rafael J. Wysocki" <rafael@...nel.org>, Tejun Heo <tj@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Saravana Kannan <saravanak@...gle.com>,
Jan Kara <jack@...e.cz>, David Howells <dhowells@...hat.com>,
Seth Forshee <seth.forshee@...onical.com>,
David Rheinsberg <david.rheinsberg@...il.com>,
Tom Gundersen <teg@...m.no>,
Christian Kellner <ckellner@...hat.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Stéphane Graber <stgraber@...ntu.com>,
linux-doc@...r.kernel.org,
Network Development <netdev@...r.kernel.org>,
Matthew Garrett <mjg59@...gle.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH 0/8] loopfs

On Wed, Apr 8, 2020 at 5:23 PM Christian Brauner
<christian.brauner@...ntu.com> wrote:
> One of the use-cases for loopfs is to allow to dynamically allocate loop
> devices in sandboxed workloads without exposing /dev or
> /dev/loop-control to the workload in question and without having to
> implement a complex and also racy protocol to send around file
> descriptors for loop devices. With loopfs each mount is a new instance,
> i.e. loop devices created in one loopfs instance are independent of any
> loop devices created in another loopfs instance. This allows
> sufficiently privileged tools to have their own private stash of loop
> device instances. Dmitry has expressed his desire to use this for
> syzkaller in a private discussion. And various parties that want to use
> it are Cced here too.
>
> In addition, the loopfs filesystem can be mounted by user namespace root
> and is thus suitable for use in containers. Combined with syscall
> interception this makes it possible to securely delegate mounting of
> images on loop devices, i.e. when a user calls mount -o loop <image>
> <mountpoint> it will be possible to completely setup the loop device.
> The final mount syscall to actually perform the mount will be handled
> through syscall interception and be performed by a sufficiently
> privileged process. Syscall interception is already supported through a
> new seccomp feature we implemented in [1] and extended in [2] and is
> actively used in production workloads. The additional loopfs work will
> be used there and in various other workloads too. You'll find a short
> illustration how this works with syscall interception below in [4].

Would that privileged process then allow you to mount your filesystem
images with things like ext4? As far as I know, the filesystem
maintainers don't generally consider "untrusted filesystem image" to
be a strongly enforced security boundary; and worse, if an attacker
has access to a loop device from which something like ext4 is mounted,
things like "struct ext4_dir_entry_2" will effectively be in shared
memory, and an attacker can trivially bypass e.g.
ext4_check_dir_entry(). At the moment, that's not a huge problem (for
anything other than kernel lockdown) because only root normally has
access to loop devices.
Ubuntu carries an out-of-tree patch that afaik blocks the shared
memory thing: <https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/eoan/commit?id=4bc428fdf5500b7366313f166b7c9c50ee43f2c4>

But even with that patch, I'm not super excited about exposing
filesystem image parsing attack surface to containers unless you run
the filesystem in a sandboxed environment (at which point you don't
need a loop device anymore either).