Message-ID: <20200724093852-mutt-send-email-mst@kernel.org>
Date: Fri, 24 Jul 2020 09:40:07 -0400
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Nick Kralevich <nnk@...gle.com>
Cc: Lokesh Gidra <lokeshgidra@...gle.com>,
Jeffrey Vander Stoep <jeffv@...gle.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Kees Cook <keescook@...omium.org>,
Daniel Colascione <dancol@...gle.com>,
Jonathan Corbet <corbet@....net>,
Alexander Viro <viro@...iv.linux.org.uk>,
Luis Chamberlain <mcgrof@...nel.org>,
Iurii Zaikin <yzaikin@...gle.com>,
Mauro Carvalho Chehab <mchehab+samsung@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Andy Shevchenko <andy.shevchenko@...il.com>,
Vlastimil Babka <vbabka@...e.cz>,
Mel Gorman <mgorman@...hsingularity.net>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Peter Xu <peterx@...hat.com>,
Mike Rapoport <rppt@...ux.ibm.com>,
Jerome Glisse <jglisse@...hat.com>, Shaohua Li <shli@...com>,
linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Linux FS Devel <linux-fsdevel@...r.kernel.org>,
Tim Murray <timmurray@...gle.com>,
Minchan Kim <minchan@...gle.com>,
Sandeep Patil <sspatil@...gle.com>, kernel@...roid.com,
Daniel Colascione <dancol@...col.org>,
Kalesh Singh <kaleshsingh@...gle.com>
Subject: Re: [PATCH 2/2] Add a new sysctl knob: unprivileged_userfaultfd_user_mode_only
On Thu, Jul 23, 2020 at 05:13:28PM -0700, Nick Kralevich wrote:
> On Thu, Jul 23, 2020 at 10:30 AM Lokesh Gidra <lokeshgidra@...gle.com> wrote:
> > From the discussion so far it seems that there is a consensus that
> > patch 1/2 in this series should be upstreamed in any case. Is there
> > anything that is pending on that patch?
>
> That's my reading of this thread too.
>
> > > > Unless I'm mistaken that you can already enforce bit 1 of the second
> > > > parameter of the userfaultfd syscall to be set with seccomp-bpf, this
> > > > would be more a question to the Android userland team.
> > > >
> > > > The question would be: does it ever happen that a seccomp filter isn't
> > > > already applied to unprivileged software running without
> > > > CAP_SYS_PTRACE capability?
> > >
> > > Yes.
> > >
> > > Android uses selinux as our primary sandboxing mechanism. We do use
> > > seccomp on a few processes, but we have found that it has a
> > > surprisingly high performance cost [1] on arm64 devices so turning it
> > > on system wide is not a good option.
> > >
> > > [1] https://lore.kernel.org/linux-security-module/202006011116.3F7109A@keescook/T/#m82ace19539ac595682affabdf652c0ffa5d27dad
>
> As Jeff mentioned, seccomp is used strategically on Android, but is
> not applied to all processes. It's too expensive and impractical when
> simpler implementations (such as this sysctl) can exist. It's also
> significantly simpler to test a sysctl value for correctness as
> opposed to a seccomp filter.

Given that SELinux is already used system-wide on Android, what is wrong
with using SELinux to control userfaultfd instead of seccomp?

> > > >
> > > >
> > > > If answer is "no" the behavior of the new sysctl in patch 2/2 (in
> > > > subject) should be enforceable with minor changes to the BPF
> > > > assembly. Otherwise it'd require more changes.
>
> It would be good to understand what these changes are.
>
> > > > Why exactly is it preferable to enlarge the surface of attack of the
> > > > kernel and take the risk there is a real bug in userfaultfd code (not
> > > > just a facilitation of exploiting some other kernel bug) that leads to
> > > > a privilege escalation, when you still break 99% of userfaultfd users,
> > > > if you set with option "2"?
>
> I can see your point if you think about the feature as a whole.
> However, distributions (such as Android) have specialized knowledge of
> their security environments, and may not want to support the typical
> usages of userfaultfd. For such distributions, providing a mechanism
> to prevent userfaultfd from being useful as an exploit primitive,
> while still allowing the very limited use of userfaultfd for userspace
> faults only, is desirable. Distributions shouldn't be forced into
> supporting 100% of the use cases envisioned by userfaultfd when their
> needs may be more specialized, and this sysctl knob empowers
> distributions to make this choice for themselves.
>
> > > > Is the system owner really going to purely run on his systems CRIU
> > > > postcopy live migration (which already runs with CAP_SYS_PTRACE) and
> > > > nothing else that could break?
>
> This is a great example of a capability which a distribution may not
> want to support, due to distribution specific security policies.
>
> > > >
> > > > Option "2" looks to me like it has a single possible user, and incidentally
> > > > this single user can already enforce model "2" by only tweaking its
> > > > seccomp-bpf filters without applying 2/2. It'd be a bug if Android
> > > > apps run unprotected by seccomp regardless of 2/2.
>
> Can you elaborate on what bug is present by processes being
> unprotected by seccomp?
>
> Seccomp cannot be universally applied on Android due to previously
> mentioned performance concerns. Seccomp is used in Android primarily
> as a tool to enforce the list of allowed syscalls, so that such
> syscalls can be audited before being included as part of the Android
> API.
>
> -- Nick
>
> --
> Nick Kralevich | nnk@...gle.com
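
For readers following the quoted discussion, the check being debated can be modeled in a few lines. This is only a sketch under stated assumptions: UFFD_USER_MODE_ONLY is taken to be the lowest bit of the userfaultfd flags argument (the "bit 1" mentioned in the quoted text), the function and parameter names are hypothetical, and real enforcement would live in the kernel's sysctl path or in a seccomp-bpf filter rather than in Python.

```python
import errno

# Assumption: the flag proposed in patch 1/2 occupies the lowest bit of the
# userfaultfd() flags argument.
UFFD_USER_MODE_ONLY = 0x1


def userfaultfd_check(flags, has_cap_sys_ptrace, user_mode_only_knob):
    """Return 0 if the call would be allowed, else a negative errno.

    Hypothetical model of the policy discussed in the thread, not kernel code.
    """
    # Privileged callers are not restricted by the knob.
    if has_cap_sys_ptrace:
        return 0
    # With the knob set, unprivileged callers must request user-mode-only
    # fault handling; otherwise the call is refused.
    if user_mode_only_knob and not (flags & UFFD_USER_MODE_ONLY):
        return -errno.EPERM
    return 0
```

Under this model, an unprivileged caller passing flags without the user-mode-only bit is refused only while the knob is set; the same condition on the flags argument is what a seccomp-bpf filter would have to test per process.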