Message-ID: <20180820212556.GC2230@char.us.oracle.com>
Date: Mon, 20 Aug 2018 17:25:56 -0400
From: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To: kernel-hardening@...ts.openwall.com,
Liran Alon <liran.alon@...cle.com>,
Deepa Srinivasan <deepa.srinivasan@...cle.com>,
linux-mm@...ck.org, juerg.haefliger@....com,
khalid.aziz@...cle.com, chris.hyser@...cle.com,
tyhicks@...onical.com, dwmw@...zon.co.uk, keescook@...gle.com,
andrew.cooper3@...rix.com, jcm@...hat.com,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Kanth <kanth.ghatraju@...cle.com>,
Joao Martins <joao.m.martins@...cle.com>, jmattson@...gle.com,
pradeep.vincent@...cle.com,
Linus Torvalds <torvalds@...ux-foundation.org>,
ak@...ux.intel.com, john.haxby@...cle.com,
jsteckli@...inf.tu-dresden.de
Cc: linux-kernel@...r.kernel.org, tglx@...utronix.de
Subject: Redoing eXclusive Page Frame Ownership (XPFO) with isolated CPUs in
mind (for KVM to isolate its guests per CPU)
Hi!
See eXclusive Page Frame Ownership (https://lwn.net/Articles/700606/), which was posted
way back in 2016.
In the last couple of months there has been a slew of CPU issues that have complicated
a lot of things. The latest - L1TF - is still fresh in folks' minds and it is
especially acute for virtualization workloads.
As such, a bunch of folks from different cloud companies (CCed) are looking
at a way to make the Linux kernel more resistant to hardware that has these sorts of
bugs.
In particular we are looking at a way to "remove as many mappings from the global
kernel address space as possible. Specifically, while being in the
context of process A, memory of process B should not be visible in the
kernel." (email from Julian Stecklina). That is the high-level view; as for
how this could get done, well, that is why I am posting this to
LKML/linux-hardening/kvm-devel/linux-mm to start the discussion.
Usually I would start with a draft of RFC patches so folks can rip it apart, but
thanks to other people (Juerg, thank you!) it already exists:
(see https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1222756.html)
The idea would be to extend this to:
1) Only do it for processes that run on CPUs which are in the isolcpus list.
2) Expand this to use per-CPU page tables. That is, each CPU has its own unique
set of pagetables - naturally _START_KERNEL -> __end would be mapped but the
rest would not.
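To make point 1) a bit more concrete, here is a rough user-space sketch (not
kernel code, and not part of the existing XPFO patches) of the "is this CPU in
the isolated set?" check. It assumes the isolated CPUs are visible through
/sys/devices/system/cpu/isolated in the usual "2-5,8" list format. The helper
names parse_cpu_list(), read_isolated_cpus() and wants_xpfo() are made up for
illustration, and the policy in wants_xpfo() is only my reading of the idea
above. Stride syntax in CPU lists is not handled.

```python
def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "2-5,8" into a set of ints."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue  # empty file or stray comma: nothing to add
        if "-" in chunk:
            first, last = chunk.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(chunk))
    return cpus


def read_isolated_cpus(path="/sys/devices/system/cpu/isolated"):
    """Return the set of isolated CPUs, or an empty set if unavailable."""
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except OSError:
        return set()


def wants_xpfo(cpu, isolated):
    """Hypothetical policy: only tasks on isolated CPUs get XPFO treatment."""
    return cpu in isolated


if __name__ == "__main__":
    isolated = read_isolated_cpus()
    print("isolated CPUs:", sorted(isolated))
```

In the kernel itself the equivalent decision would be made against the
isolation cpumask rather than by reading sysfs; the sketch is only meant to
pin down which CPUs the proposal would apply to.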
Thoughts? Is this possible? Crazy? Better ideas?