linux-kernel - Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access


Message-ID: <1471306391.12231.137.camel@kernel.crashing.org>
Date:	Tue, 16 Aug 2016 10:13:11 +1000
From:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
To:	"Rustad, Mark D" <mark.d.rustad@...el.com>
Cc:	Alex Williamson <alex.williamson@...hat.com>,
	Alexander Duyck <alexander.duyck@...il.com>,
	Alexey Kardashevskiy <aik@...abs.ru>,
	Bjorn Helgaas <helgaas@...nel.org>,
	Hannes Reinecke <hare@...e.de>,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Babu Moger <babu.moger@...cle.com>,
	Paul Mackerras <paulus@...ba.org>,
	"santosh@...lsio.com" <santosh@...lsio.com>,
	Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access

On Mon, 2016-08-15 at 23:16 +0000, Rustad, Mark D wrote:
> 
> Bugs in existing guests is an interesting case, but I have been focused on  
> getting acceptable behavior from a properly functioning guest, in the face  
> of hardware issues that can only be resolved in a single place.
> 
> I agree that a malicious guest can cause all kinds of havoc with  
> directly-assigned devices. Consider a 4-port PHY chip on a shared MDIO bus,  
> for instance. There is really nothing to be done about the potential for  
> mischief with that kind of thing.
> 
> The VPD problem that I had been concerned about arises from a bad design in  
> the PCI spec together with implementations that share the registers across  
> functions. The hardware isn't going to change and I really doubt that the  
> spec will either, so we address it the only place we can.

Right but as I mentioned, there are plenty of holes when it comes to
having multi function devices assigned to different guests, this is just
one of them.

Now, that being said, "working around" the bug to make the "non fully secure"
case work (in my example case the "desktop user") is probably an ok workaround
as long as we fully agree that this is all it is; it by no means provides
actual isolation or security.
 
> I am certain that we agree that not everything can or should be addressed  
> in vfio. I did not mean to suggest it should try to address everything, but  
> I think it should make it possible for correctly behaving guests to work. I  
> think that is not unreasonable.

Again as long as there is no expectation of security here, such as a data
center giving PCI access to some devices.

> Perhaps the VPD range check should really just have been implemented for  
> the sysfs interface, and left the vfio case unchecked. I don't know because  
> I was not involved in that issue. Perhaps someone more intimately involved  
> can comment on that notion.

That would have been my preferred approach... I think VFIO tries to do too much
which complicates things, causes other bugs, without bringing actual safety. I
don't think it should stand in between a guest driver and its device unless
absolutely necessary to provide the functionality due to broken HW or design,
but with the full understanding that doing so remains unsafe from an isolation
standpoint.

> > Assuming that a device coming back from a guest is functional and not
> > completely broken and can be re-used without a full PERST or power cycle
> > is a wrong assumption. It may or may not work, no amount of "filtering"
> > will fix the fundamental issue. If your HW won't give you access to PERST
> > well ... blame Intel for not specifying a standard way to generate it in
> > the first place :-)
> 
> Yeah, I worry about the state that a malicious guest could leave a device  
> in, but I consider direct assignment always risky anyway. I would just like  
> it to at least work in the non-malicious guest cases.

Right. Only SR-IOV which is somewhat designed for assignment is reasonably safe
in the general case.

On server POWER boxes, we have isolation at the bus level with usual per-slot
PERST control, so we are in a much better situation, but we also, for all the
above reasons, only allow slot granularity for pass-through.

> I guess my previous response was really just too terse, I was just focused  
> on unavoidable hangs and data corruption, which even were happening without  
> any guest involvement. For me, guests were just an additional exposure of  
> the same underlying issue.
> 
> With hindsight, it is easy to see that a standard reset would now be a  
> pretty useful thing. I am sure that even if it existed, we would now have  
> lots and lots of quirks around it as well! :-)

Hehe yes, well, HW for you ...

Cheers,
Ben.

> --
> Mark Rustad, Networking Division, Intel Corporation