Message-ID: <Y9FPzD/kpL4sI/Qa@work-vm>
Date: Wed, 25 Jan 2023 15:50:36 +0000
From: "Dr. David Alan Gilbert" <dgilbert@...hat.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: "Reshetova, Elena" <elena.reshetova@...el.com>,
"Shishkin, Alexander" <alexander.shishkin@...el.com>,
"Shutemov, Kirill" <kirill.shutemov@...el.com>,
"Kuppuswamy, Sathyanarayanan" <sathyanarayanan.kuppuswamy@...el.com>,
"Kleen, Andi" <andi.kleen@...el.com>,
"Hansen, Dave" <dave.hansen@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
"Wunner, Lukas" <lukas.wunner@...el.com>,
Mika Westerberg <mika.westerberg@...ux.intel.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>,
"Poimboe, Josh" <jpoimboe@...hat.com>,
"aarcange@...hat.com" <aarcange@...hat.com>,
Cfir Cohen <cfir@...gle.com>, Marc Orr <marcorr@...gle.com>,
"jbachmann@...gle.com" <jbachmann@...gle.com>,
"pgonda@...gle.com" <pgonda@...gle.com>,
"keescook@...omium.org" <keescook@...omium.org>,
James Morris <jmorris@...ei.org>,
Michael Kelley <mikelley@...rosoft.com>,
"Lange, Jon" <jlange@...rosoft.com>,
"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux guest kernel threat model for Confidential Computing
* Greg Kroah-Hartman (gregkh@...uxfoundation.org) wrote:
> On Wed, Jan 25, 2023 at 02:57:40PM +0000, Dr. David Alan Gilbert wrote:
> > * Greg Kroah-Hartman (gregkh@...uxfoundation.org) wrote:
> > > On Wed, Jan 25, 2023 at 01:42:53PM +0000, Dr. David Alan Gilbert wrote:
> > > > * Greg Kroah-Hartman (gregkh@...uxfoundation.org) wrote:
> > > > > On Wed, Jan 25, 2023 at 12:28:13PM +0000, Reshetova, Elena wrote:
> > > > > > Hi Greg,
> > > > > >
> > > > > > > You mentioned a couple of times (last time in this recent thread:
> > > > > > https://lore.kernel.org/all/Y80WtujnO7kfduAZ@kroah.com/) that we ought to start
> > > > > > discussing the updated threat model for kernel, so this email is a start in this direction.
> > > > >
> > > > > Any specific reason you didn't cc: the linux-hardening mailing list?
> > > > > This seems to be in their area as well, right?
> > > > >
> > > > > > As we have shared before in various lkml threads/conference presentations
> > > > > > ([1], [2], [3] and many others), for the Confidential Computing guest kernel, we have a
> > > > > > change in the threat model where guest kernel doesn’t anymore trust the hypervisor.
> > > > >
> > > > > That is, frankly, a very funny threat model. How realistic is it really
> > > > > given all of the other ways that a hypervisor can mess with a guest?
> > > >
> > > > It's what a lot of people would like; in the early attempts it was easy
> > > > to defeat, but in TDX and SEV-SNP the hypervisor has a lot less that it
> > > > can mess with - remember that not just the memory is encrypted, so is
> > > > the register state, and the guest gets to see changes to mapping and a
> > > > lot of control over interrupt injection etc.
> > >
> > > And due to the fact that SEV and TDX really do not work, how is anyone
> > > expecting any of this to work? As one heckler on IRC recently put it,
> > > if you squint hard enough, you can kind of ignore the real-world issues
> > > here, so perhaps this should all be called "squint-puting" in order to
> > > feel like you have a "confidential" system? :)
> >
> > I agree the original SEV was that weak; I've not seen anyone give a good
> > argument against SNP or TDX.
>
> Argument that it doesn't work? I thought that ship sailed a long time
> ago but I could be wrong as I don't really pay attention to that stuff
> as it's just vaporware :)
You're being unfair claiming it's vaporware. You have been able to go out
and buy SNP hardware for over a year now, and the patches are on list and
under review (and have been for quite a while).
If you're claiming it doesn't work, please justify it.
> > > > > So what do you actually trust here? The CPU? A device? Nothing?
> > > >
> > > > We trust the actual physical CPU, provided that it can prove that it's a
> > > > real CPU with the CoCo hardware enabled.
> > >
> > > Great, so why not have hardware attestation also for your devices you
> > > wish to talk to? Why not use that as well? Then you don't have to
> > > worry about anything in the guest.
> >
> > There were some talks at Plumbers where PCIe is working on adding that;
> > it's not there yet though. I think that's PCIe 'Integrity and Data
> > Encryption' (IDE - sigh), and PCIe 'Security Protocol and Data Model' -
> > SPDM. I don't know much of the detail of those, just that they're far
> > enough off that people aren't depending on them yet.
>
> Then work with those groups to implement that in an industry-wide way
> and then take advantage of it by adding support for it to Linux! Don't
> try to reinvent the same thing in a totally different way please.
Sure, people are working with them; but those are going to take time
and people want to use existing PCIe devices; and given that the hosts
are available that seems reasonable.
> > > > Both the SNP and TDX hardware
> > > > can perform an attestation signed by the CPU to prove to someone
> > > > external that the guest is running on a real trusted CPU.
> > >
> > > And again, do the same thing for the other hardware devices and all is
> > > good. To not do that is to just guess and wave hands. You know this :)
> >
> > That wouldn't help you necessarily for virtual devices - where the
> > hypervisor implements the device (like a virtual NIC).
>
> Then create a new bus for that if you don't trust the virtio bus today.
It's not that I distrust the virtio bus - just that we need to make sure
its implementation is pessimistic enough for CoCo.
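
To illustrate what "pessimistic enough" could mean for a virtio-style driver, here is a minimal sketch, not taken from the thread or from any kernel patch. It assumes a simplified ring where the guest posts buffers by descriptor id and the host reports completions as an (id, length) pair. The names `GuestQueue`, `post`, `complete` and `UntrustedDeviceError` are hypothetical. The idea is only that every value written by the host side is checked against what the guest itself recorded before it is used.

```python
from dataclasses import dataclass, field


class UntrustedDeviceError(Exception):
    """Raised when the host side reports something the guest never offered."""


@dataclass
class GuestQueue:
    """Hypothetical guest-side bookkeeping for one virtqueue-like ring."""
    size: int
    # Maps descriptor id -> length of the buffer the guest posted for it.
    outstanding: dict = field(default_factory=dict)

    def post(self, desc_id: int, buf_len: int) -> None:
        """Record a buffer the guest is handing to the device."""
        if not 0 <= desc_id < self.size:
            raise ValueError("descriptor id out of range")
        if desc_id in self.outstanding:
            raise ValueError("descriptor already in flight")
        self.outstanding[desc_id] = buf_len

    def complete(self, desc_id: int, written_len: int) -> int:
        """Validate a completion written by the (untrusted) host side."""
        if not 0 <= desc_id < self.size:
            raise UntrustedDeviceError("completion id outside the ring")
        if desc_id not in self.outstanding:
            raise UntrustedDeviceError("completion for a buffer never posted")
        buf_len = self.outstanding[desc_id]
        if not 0 <= written_len <= buf_len:
            raise UntrustedDeviceError("claimed length exceeds posted buffer")
        # Only forget the descriptor once the completion has passed every check,
        # so a replayed completion for the same id is rejected afterwards.
        del self.outstanding[desc_id]
        return written_len
```

A driver written for a trusted hypervisor might skip all three checks in `complete`; in the CoCo model each of them guards a value the host controls.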
> > > > > I hate the term "hardening". Please just say it for what it really is,
> > > > > "fixing bugs to handle broken hardware". We've done that for years when
> > > > > dealing with PCI and USB and even CPUs doing things that they shouldn't
> > > > > be doing. How is this any different in the end?
> > > > >
> > > > > So what you also are saying here now is "we do not trust any PCI
> > > > > devices", so please just say that (why do you trust USB devices?) If
> > > > > that is something that you all think that Linux should support, then
> > > > > let's go from there.
> > > >
> > > > I don't think generally all PCI device drivers guard against all the
> > > > nasty things that a broken implementation of their hardware can do.
> > >
> > > I know that all PCI drivers can NOT do that today as that was never
> > > anything that Linux was designed for.
> >
> > Agreed; which again is why I only really worry about the subset of
> > devices I'd want in a CoCo VM.
>
> Everyone wants a subset, different from other's subset, which means you
> need them all. Sorry.
I think for CoCo the subset is fairly small, even including all the
people discussing it. It's the virtual devices, and a few of their
favourite physical devices, but a fairly small subset.
> > > > The USB devices are probably a bit better, because they actually worry
> > > > about people walking up with a nasty HID device; I'm skeptical that
> > > > a kernel would survive a purposely broken USB controller.
> > >
> > > I agree with you there, USB drivers are only starting to be fuzzed at
> > > the descriptor level, that's all. Which is why they too can be put into
> > > the "untrusted" area until you trust them.
> > >
> > > > I'm not sure the request here is really to make sure *all* PCI devices
> > > > are safe; just the ones we care about in a CoCo guest (e.g. the virtual devices) -
> > > > and potentially ones that people will want to pass-through (which
> > > > generally needs a lot more work to make safe).
> > > > (I've not looked at these Intel tools to see what they cover)
> > >
> > > Why not just create a whole new bus path for these "trusted" devices to
> > > attach to and do that instead of trying to emulate a protocol that was
> > > explicitly designed NOT to this model at all? Why are you trying to
> > > shoehorn something here and not just designing it properly from the
> > > beginning?
> >
> > I'd be kind of OK with that for the virtual devices; but:
> >
> > a) I think you'd start reinventing PCIe with enumeration etc
>
> Great, then work with the PCI group as talked about above to solve it
> properly and not do whack-a-mole like seems to be happening so far.
>
> > b) We do want those pass through NICs etc that are PCIe
> > - as long as you use normal guest crypto stuff then the host
> > can be just as nasty as it likes with the data they present.
>
> Great, work with the PCI spec for verified devices.
>
> > c) The world has enough bus protocols, and people understand the
> > basics of PCI(e) - we really don't need another one.
>
> Great, work with the PCI spec people please.
As I say above; all happening - but it's going to take years.
It's wrong to leave users with less secure solutions if there are simple
fixes available. I agree that if it involves major pain all over then
I can see your dislike - but if it's small fixes then what's the
problem?
> > > > Having said that, how happy are you with Thunderbolt PCI devices being
> > > > plugged into your laptop or into the hotplug NVMe slot on a server?
> > >
> > > We have protection for that, and have had it for many years. Same for
> > > USB devices. This isn't new, perhaps you all have not noticed those
> > > features be added and taken advantage of already by many Linux distros
> > > and system images (i.e. ChromeOS and embedded systems?)
> >
> > What protection? I know we have an IOMMU, and that stops the device
> > stamping all over RAM by itself - but I think Intel's worries are more
> > subtle, things where the device starts playing with what PCI devices
> > are expected to do to try and trigger untested kernel paths. I don't
> > think there's protection against that.
> > I know we can lock by PCI/USB vendor/device ID - but those can be made
> > up trivially; protection like that is meaningless.
>
> Then combine it with device attestation and you have a solved solution,
> don't ignore others working on this please.
>
> > > > We're now in the position we were with random USB devices years ago.
> > >
> > > Nope, we are not, again, we already handle random PCI devices being
> > > plugged in. It's up to userspace to make the policy decision if it
> > > should be trusted or not before the kernel has access to it.
> > >
> > > So a meta-comment, why not just use that today? If your guest OS can
> > > not authenticate the PCI device passed to it, don't allow the kernel to
> > > bind to it. If it can be authenticated, wonderful, bind away! You can
> > > do this today with no kernel changes needed.
> >
> > Because:
> > a) there's no good way to authenticate a PCI device yet
> > - any nasty device can claim to have a given PCI ID.
> > b) Even if you could, there's no man-in-the-middle protection yet.
>
> Where is the "man" here in the middle of?
I'm worried about what a malicious hypervisor could do.
> And any PCI attestation should handle that, if not, work with them to
> solve that please.
I believe the two mechanisms I mentioned above would handle that, when
they eventually get there.
> Thunderbolt has authenticated device support today, and so does PCI, and
> USB has had it for a decade or so. Use the in-kernel implementation
> that we already have or again, show us where it is lacking and we will
> be glad to take patches to cover the holes (as we did last year when
> ChromeOS implemented support for it in their userspace.)
I'd appreciate pointers to the implementations you're referring to.
> > > > Also we would want to make sure that any config data that the hypervisor
> > > > can pass to the guest is validated.
> > >
> > > Define "validated" please.
> >
> > Let's say you get something like an ACPI table or qemu fw.cfg table
> > giving details of your devices; if the hypervisor builds those in a
> > nasty way what happens?
>
> You tell me, as we trust ACPI tables today, and if we can not, again
> then you need to change the model of what Linux does. Why isn't the
> BIOS authentication path working properly for ACPI tables already today?
> I thought that was a long-solved problem with UEFI (if not, I'm sure the
> UEFI people would be interested.)
If it's part of the BIOS image that's measured/loaded during startup
then we're fine; if it's a table dynamically generated by the hypervisor
I'm more worried.
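
The difference between a measured table and a dynamically generated one can be sketched as below. This is an illustration under stated assumptions, not how any firmware or kernel does it. `acpi_table_ok` and `table_is_measured` are hypothetical names, and the set of expected digests is assumed to come from somewhere the guest already trusts (for example a value covered by the launch measurement). The structural check relies only on the standard ACPI table header layout: a 36-byte header, a 32-bit little-endian length at offset 4, and a checksum byte chosen so that all bytes of the table sum to zero modulo 256.

```python
import hashlib
import struct

ACPI_HEADER_LEN = 36  # size of the standard ACPI system description table header


def acpi_table_ok(blob: bytes) -> bool:
    """Structural checks only: the length field fits the buffer and the checksum holds."""
    if len(blob) < ACPI_HEADER_LEN:
        return False
    (length,) = struct.unpack_from("<I", blob, 4)  # table length at offset 4
    if length < ACPI_HEADER_LEN or length > len(blob):
        return False
    # All bytes of a well-formed table sum to 0 modulo 256.
    return sum(blob[:length]) % 256 == 0


def table_is_measured(blob: bytes, expected_sha256: set) -> bool:
    """Accept the table only if its digest is in a set of known-good values."""
    return hashlib.sha256(blob).hexdigest() in expected_sha256
```

The checksum only catches accidental corruption, since a hostile hypervisor can recompute it. Only the digest comparison says anything about where the table came from, which is why a table built at run time, with no expected digest to compare against, is the worrying case.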
> Anyway, I'll wait until I see real patches as this thread seems to be
> totally vague and ignores our current best-practices for pluggable
> devices for some odd reason.
Please point people at those best practices rather than just ranting
about how pointless you feel all this is!
The patches here from Intel are a TOOL to find problems; I can't see the
objections to having a tool like this.
(I suspect some of these fixes might make the kernel a bit more robust
against unexpected hot-remove of PCIe devices as well; but that's more
of a guess)
Dave
> thanks,
>
> greg k-h
>
--
Dr. David Alan Gilbert / dgilbert@...hat.com / Manchester, UK