Message-ID: <20180109214151.GB13282@1wt.eu>
Date: Tue, 9 Jan 2018 22:41:51 +0100
From: Willy Tarreau <w@....eu>
To: Andy Lutomirski <luto@...nel.org>
Cc: Borislav Petkov <bp@...en8.de>,
LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
Brian Gerst <brgerst@...il.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Ingo Molnar <mingo@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Josh Poimboeuf <jpoimboe@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, Kees Cook <keescook@...omium.org>
Subject: Re: [RFC PATCH v2 2/6] x86/arch_prctl: add ARCH_GET_NOPTI and
ARCH_SET_NOPTI to enable/disable PTI
On Tue, Jan 09, 2018 at 01:26:57PM -0800, Andy Lutomirski wrote:
> On Tue, Jan 9, 2018 at 6:54 AM, Willy Tarreau <w@....eu> wrote:
> > On Tue, Jan 09, 2018 at 03:51:57PM +0100, Borislav Petkov wrote:
> >> On Tue, Jan 09, 2018 at 03:36:53PM +0100, Willy Tarreau wrote:
> >> > I see and am not particularly against this, but what use case do you
> >> > have in mind precisely ? I doubt it's just saving a few tens of bytes,
> >> > so probably you're more concerned about the potential risks this opens ?
> >> > But given we only allow this for CAP_SYS_RAWIO and these ones already
> >> > have access to /dev/mem and many other things, don't you think there
> >> > are much easier ways to dump kernel memory in this case than trying to
> >> > inject some meltdown code into the victim process ? Or maybe you have
> >> > other cases in mind that I'm not seeing.
> >>
> >> I'd like this to be config-controllable so that distros can make the
> >> decision whether/if they want to support the whole per-mm thing.
> >
> > OK.
> >
> >> Also, if CAP_SYS_RAWIO is going to protect, please make the
> >> ARCH_GET_NOPTI variant check it too.
> >
> > Interestingly I removed the check consecutive to the discussions. But
> > I think I'll simply remove the whole ARCH_GET_NOPTI as it has no real
> > value beyond initial development.
> >
>
> I've thought about this a bit more. Here are my thoughts:
>
> 1. I don't like it being per-mm. I think it should be a per-thread
> control so that a program can have a thread with PTI that runs
> less-trusted JavaScript and other network threads with PTI off.

Ingo suggested such a use case as well. While I'm quite inclined to agree
with it, I'm just wondering: do we really have processes that are both I/O
bound and executing JavaScript or similar in a thread? Well, thinking
about it, we have Lua in haproxy, and we could imagine having JavaScript
later when admins don't want to learn Lua. So that could make sense
(/me takes a sickness bag to throw up).

> Obviously we lose NX protection mm-wide if any threads have PTI off.
> I think the way to implement this is:
>
> Have this in struct mm_context:
>
> bool has_non_pti_thread;
>
> To turn PTI off on a thread:
>
> Take pagetable_lock.
> if (!has_non_pti_thread) {
> context.has_non_pti_thread = true;
> clear the NX bits;
> }
> drop pagetable_lock;
> set the TI flag;

Linus suggested that we refuse to turn off PTI if any thread has already
been created, and I really agree with this. It's not incompatible with
what you have above: we could just turn it on again for certain threads.

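For what it's worth, the sequence quoted above, together with the rule that
newly created threads start with PTI on, could be modelled in userspace along
these lines. This is only a sketch: struct mm_model, struct thread_model,
nx_clear_count and the helper names are made up for illustration and are not
kernel code; the counter merely stands in for the "clear the NX bits" step.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; every name here is hypothetical. */
struct mm_model {
    pthread_mutex_t pagetable_lock;
    bool has_non_pti_thread; /* set once any thread has turned PTI off */
    int nx_clear_count;      /* stand-in for "clear the NX bits" */
};

struct thread_model {
    struct mm_model *mm;
    bool nopti;              /* stand-in for the per-thread TI flag */
};

static inline void mm_model_init(struct mm_model *mm)
{
    assert(mm != NULL);
    pthread_mutex_init(&mm->pagetable_lock, NULL);
    mm->has_non_pti_thread = false;
    mm->nx_clear_count = 0;
}

/* Newly created threads always start with PTI on. */
static inline struct thread_model thread_model_new(struct mm_model *mm)
{
    struct thread_model t = { .mm = mm, .nopti = false };
    return t;
}

/* Turn PTI off for one thread; the mm-wide step runs at most once. */
static inline void thread_model_pti_off(struct thread_model *t)
{
    assert(t != NULL && t->mm != NULL);
    pthread_mutex_lock(&t->mm->pagetable_lock);
    if (!t->mm->has_non_pti_thread) {
        t->mm->has_non_pti_thread = true;
        t->mm->nx_clear_count++;
    }
    pthread_mutex_unlock(&t->mm->pagetable_lock);
    t->nopti = true;
}

/* Turning PTI back on only clears the per-thread flag. */
static inline void thread_model_pti_on(struct thread_model *t)
{
    assert(t != NULL);
    t->nopti = false;
}
```

Note that in this model the per-mm flag stays set once any thread has opted
out, even after that thread turns PTI back on, which matches the remark that
NX protection is lost mm-wide.
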
> Fork clears the per-mm flag in the new mm. Exec clears it, too. I
> think that's all that's needed. Newly created threads always have PTI
> on.

Fork doesn't clear it (exec indeed does). Fork clearing it would be
problematic, as it would mean you can't set it in a daemon during startup.

> To turn PTI back on, just clear the TI flag.
>
> 2.Turning off PTI is, in general, a terrible idea. It totally breaks
> any semblance of a security model on a Meltdown-affected CPU.

Absolutely, but it recovers what matters more in *certain* workloads,
which is performance.

> So I
> think we should require CAP_SYS_RAWIO *and* that the system is booted
> with pti=allow_optout or something like that.

I'm really not a fan of this. 1) It would require a reboot during peak
hours to fix the problem. 2) The flag will end up being deployed
everywhere by default in performance-sensitive environments "just in
case", so it will be rendered useless.

I'm fine with Boris' requirement that the kernel be built with the
appropriate option to support this. If you're doing your own builds,
you can well take care of having the appropriate options (PTI + the right
to turn it off) and deploy such kernels where relevant.

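To make that combination concrete, a build-time option plus the
CAP_SYS_RAWIO check could be modelled like this. Purely a sketch:
MODEL_PTI_ALLOW_OPTOUT is a made-up stand-in for whatever the config option
ends up being called, the capability is passed in as a plain boolean rather
than queried from the kernel, and the error codes are my assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical build-time switch; not an existing kernel config symbol. */
#ifndef MODEL_PTI_ALLOW_OPTOUT
#define MODEL_PTI_ALLOW_OPTOUT 1
#endif

/*
 * Decide whether a request to turn PTI off may proceed.
 * Returns 0 if allowed, or a negative errno-style value (assumed codes).
 */
static inline int nopti_check(bool built_with_optout, bool has_cap_sys_rawio)
{
    if (!built_with_optout)
        return -EINVAL;   /* support not compiled in */
    if (!has_cap_sys_rawio)
        return -EPERM;    /* caller lacks the required capability */
    return 0;
}

/* Same check, using the build-time switch defined above. */
static inline int nopti_check_this_build(bool has_cap_sys_rawio)
{
    return nopti_check(MODEL_PTI_ALLOW_OPTOUT != 0, has_cap_sys_rawio);
}
```

With this shape no boot parameter is involved: a kernel built without the
option refuses the request outright, and one built with it still requires
the capability.
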
Willy