Message-ID: <1516785846.13558.106.camel@infradead.org>
Date: Wed, 24 Jan 2018 09:24:06 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Dominik Brodowski <linux@...inikbrodowski.net>,
Martin Schwidefsky <schwidefsky@...ibm.com>
Cc: linux-kernel@...r.kernel.org, linux-s390@...r.kernel.org,
kvm@...r.kernel.org, Heiko Carstens <heiko.carstens@...ibm.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Cornelia Huck <cohuck@...hat.com>,
David Hildenbrand <david@...hat.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jon Masters <jcm@...hat.com>,
Marcus Meissner <meissner@...e.de>,
Jiri Kosina <jkosina@...e.cz>, w@....eu, keescook@...omium.org,
thomas.lendacky@....com, ak@...ux.intel.com, pavel@....cz
Subject: Re: Avoiding information leaks between users and between processes
by default? [Was: : [PATCH 1/5] prctl: add PR_ISOLATE_BP process control]

On Wed, 2018-01-24 at 09:37 +0100, Dominik Brodowski wrote:
> On Wed, Jan 24, 2018 at 07:29:53AM +0100, Martin Schwidefsky wrote:
> >
> > On Tue, 23 Jan 2018 18:07:19 +0100
> > Dominik Brodowski <linux@...inikbrodowski.net> wrote:
> >
> > >
> > > On Tue, Jan 23, 2018 at 02:07:01PM +0100, Martin Schwidefsky wrote:
> > > >
> > > > Add the PR_ISOLATE_BP operation to prctl. The effect of the process
> > > > control is to make all branch prediction entries created by the execution
> > > > of the user space code of this task not applicable to kernel code or the
> > > > code of any other task.
> > >
> > > What is the rationale for requiring a per-process *opt-in* for this added
> > > protection?
> > >
> > > For KPTI on x86, the exact opposite approach is being discussed (see, e.g.
> > > http://lkml.kernel.org/r/1515612500-14505-1-git-send-email-w@1wt.eu ): By
> > > default, play it safe, with KPTI enabled. But for "trusted" processes, one
> > > may opt out using prctl.
> >
> > The rationale is that there are cases where you got code from *somewhere*
> > and want to run it in an isolated context. Think: a docker container that
> > runs under KVM. But with spectre this is still not really safe. So you
> > include a wrapper program in the docker container to use the trap door
> > prctl to start the potentially malicious program. Now you should be good, no?
>
> Well, partly. It may be that s390 and its use cases are special -- but as I
> understand it, this uapi question goes beyond this question:
>
> To my understanding, Linux traditionally tried to aim for the security goal
> of avoiding information leaks *between* users[+], probably even between
> processes of the same user. It wasn't a guarantee, and there always were
> (and will be) information leaks -- and that is where additional safeguards
> such as seccomp come into play, which reduce the attack surface against
> unknown or unresolved security-related bugs. And everyone knew (or should
> have known) that allowing "untrusted" code to be run (be it by a user, be
> it JavaScript, etc.) is more risky. But still, avoiding information leaks
> between users and between processes was (to my understanding) at least a
> goal.[§]
>
> In recent days however, the outlook on this issue seems to have shifted:
>
> - Your proposal would mean to trust all userspace code, unless it is
> specifically marked as untrusted. As I understand it, this would mean that
> by default, spectre isn't fully mitigated cross-user and cross-process,
> though the kernel could. And rogue user-run code may make use of that,
> unless it is run with a special wrapper.
>
> - Concerning x86 and IBPB, the current proposal is to limit the protection
> offered by IBPB to non-dumpable processes. As I understand it, this would
> mean that other processes are left hanging out to dry.[~]
>
> - Concerning x86 and STIBP, David mentioned that "[t]here's an argument that
> there are so many other information leaks between HT siblings that we
> might not care"; in the last couple of hours, a proposal emerged to limit
> the protection offered by STIBP to non-dumpable processes as well. To my
> understanding, this would mean that many processes are left hanging out to
> dry again.
>
> I am a bit worried whether this is a sign of a shift in the security goals.
> I fully understand that there might be processes (e.g. some[?] kernel
> threads) and users (root) which you need to trust anyway, as they can
> already access anything. Disabling additional, costly safeguards for
> those special cases then seems OK. Opting out of additional protections for
> single-user or single-use systems (haproxy?) might make sense as well. But
> the kernel[*] not offering full[#] spectre mitigation by default for regular
> users and their processes? I'm not so sure.

Note that for STIBP/IBPB the operation of the flag is different in
another way. We're using it as a "protect this process from others"
flag, not a "protect others from this process" flag.

I'm not sure this is a fundamental shift in overall security goals;
it is more a recognition that on *current* hardware the cost of 100%
protection against an attack that was fairly unlikely in the first
place is fairly prohibitive. For a process to make itself non-dumpable
is a simple enough way to opt in. And *maybe* we could contemplate a
command line option for 'IBPB always', but I'm *really* wary of exposing
too much of that stuff, rather than simply trying to Do The Right
Thing.

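
A minimal sketch of what "make itself non-dumpable" looks like from
user space, using the existing prctl(2) operations PR_SET_DUMPABLE and
PR_GET_DUMPABLE. The helper names are hypothetical, and whether a given
kernel actually keys its IBPB/STIBP behaviour off the dumpable flag
depends on the patches under discussion here, so treat that link as an
assumption rather than a guarantee.

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper: clear the calling process's "dumpable" flag.
 * Returns 0 on success, -1 on failure (errno is set by prctl).
 */
int make_self_nondumpable(void)
{
    return prctl(PR_SET_DUMPABLE, 0, 0, 0, 0);
}

/*
 * Hypothetical helper: query the current "dumpable" flag.
 * Returns 0 if the process is non-dumpable, a positive value if it is
 * dumpable, and -1 on error.
 */
int is_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}
```

A process would call make_self_nondumpable() early, before it handles
any secrets. Clearing the flag also suppresses core dumps and restricts
ptrace attachment by unprivileged processes, so the opt-in has side
effects beyond branch prediction.
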
> [*] Whether CPUs should enable full mitigation (IBRS_ALL) by default
> in future has been discussed on this list as well.

The kernel will do that; it's just not implemented yet because it's
slightly non-trivial and can't be fully tested yet. We *will* want to
ALTERNATIVE away the retpolines and just set IBRS_ALL because it'll be
faster to do so.

For IBRS_ALL, note that we still need the same IBPB flushes on context
switch; just not STIBP. That's because IBRS_ALL, as Linus so eloquently
reminded us, is *still* a stop-gap measure and not actually a fix.
Reading between the lines, I think tagging predictions with the ring
(and HT sibling?) they came from is the best they could slip into the
next generation without having to stop the fabs for two years while
they go back to the drawing board.

A real fix will *hopefully* come later, but unfortunately Intel haven't
even defined the bit in IA32_ARCH_CAPABILITIES which advertises "you
don't have to do any of this shit any more; we fixed it", analogous to
their RDCL_NO bit for "no more Meltdown". I'm *hoping* that's just an
oversight in preparing the doc and not looking far enough ahead, rather
than an actual *intent* to never fix it properly as Linus inferred.
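
To make the above concrete, here is a rough sketch of how the two
defined IA32_ARCH_CAPABILITIES bits would map onto the mitigations as
described in this mail. The bit positions (RDCL_NO is bit 0, IBRS_ALL
is bit 1) follow Intel's documentation; the struct, its field names and
the function are hypothetical, and the decision logic only restates the
reasoning above. It is not what any particular kernel implements.

```c
#include <stdint.h>

#define ARCH_CAP_RDCL_NO   (1ULL << 0)  /* CPU not affected by Meltdown */
#define ARCH_CAP_IBRS_ALL  (1ULL << 1)  /* enhanced, always-on IBRS available */

/* Hypothetical summary of which mitigations are still required. */
struct mitigation_plan {
    int need_kpti;            /* page table isolation against Meltdown */
    int use_ibrs_all;         /* set IBRS_ALL instead of using retpolines */
    int need_ibpb_on_switch;  /* IBPB flush on context switch */
    int need_stibp;           /* STIBP between HT siblings */
};

/* Derive the plan from a raw IA32_ARCH_CAPABILITIES value. */
struct mitigation_plan plan_from_arch_caps(uint64_t caps)
{
    struct mitigation_plan p;
    int ibrs_all = (caps & ARCH_CAP_IBRS_ALL) != 0;

    p.need_kpti = (caps & ARCH_CAP_RDCL_NO) == 0;
    p.use_ibrs_all = ibrs_all;
    /* No "Spectre fixed" bit is defined, so IBPB is always needed. */
    p.need_ibpb_on_switch = 1;
    /* Per the mail: with IBRS_ALL, STIBP is no longer needed. */
    p.need_stibp = !ibrs_all;
    return p;
}
```

Note that need_ibpb_on_switch can never be cleared in this sketch,
which is exactly the problem: there is no capability bit to test for.
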