Message-ID: <YtcfqpmpkVXz/Frl@xz-m1.local>
Date: Tue, 19 Jul 2022 17:18:34 -0400
From: Peter Xu <peterx@...hat.com>
To: Axel Rasmussen <axelrasmussen@...gle.com>
Cc: Alexander Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"Dmitry V . Levin" <ldv@...linux.org>,
Gleb Fotengauer-Malinovskiy <glebfm@...linux.org>,
Hugh Dickins <hughd@...gle.com>, Jan Kara <jack@...e.cz>,
Jonathan Corbet <corbet@....net>,
Mel Gorman <mgorman@...hsingularity.net>,
Mike Kravetz <mike.kravetz@...cle.com>,
Mike Rapoport <rppt@...nel.org>, Nadav Amit <namit@...are.com>,
Shuah Khan <shuah@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>,
zhangyi <yi.zhang@...wei.com>, linux-doc@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v4 2/5] userfaultfd: add /dev/userfaultfd for fine
grained access control
On Tue, Jul 19, 2022 at 12:56:25PM -0700, Axel Rasmussen wrote:
> Historically, it has been shown that intercepting kernel faults with
> userfaultfd (thereby forcing the kernel to wait for an arbitrary amount
> of time) can be exploited, or at least can make some kinds of exploits
> easier. So, in 37cd0575b8 "userfaultfd: add UFFD_USER_MODE_ONLY" we
> changed things so, in order for kernel faults to be handled by
> userfaultfd, either the process needs CAP_SYS_PTRACE, or this sysctl
> must be configured so that any unprivileged user can do it.
>
> In a typical implementation of a hypervisor with live migration (take
> QEMU/KVM as one such example), we do indeed need to be able to handle
> kernel faults. But, both options above are less than ideal:
>
> - Toggling the sysctl increases attack surface by allowing any
> unprivileged user to do it.
>
> - Granting the live migration process CAP_SYS_PTRACE gives it this
> ability, but *also* the ability to "observe and control the
> execution of another process [...], and examine and change [its]
> memory and registers" (from ptrace(2)). This isn't something we need
> or want to be able to do, so granting this permission violates the
> "principle of least privilege".
>
> This is all a long-winded way to say: we want a more fine-grained way to
> grant access to userfaultfd, without granting other additional
> permissions at the same time.
>
> To achieve this, add a /dev/userfaultfd misc device. This device
> provides an alternative to the userfaultfd(2) syscall for the creation
> of new userfaultfds. The idea is, any userfaultfds created this way will
> be able to handle kernel faults, without the caller having any special
> capabilities. Access to this mechanism is instead restricted using e.g.
> standard filesystem permissions.
>
> Signed-off-by: Axel Rasmussen <axelrasmussen@...gle.com>
Thanks, this looks much better.

Acked-by: Peter Xu <peterx@...hat.com>
--
Peter Xu