linux-kernel - Re: [PATCH v3 7/9] vfio: Use driver_override to avert binding to compromising drivers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20170713082314.GA5921@kroah.com>
Date:   Thu, 13 Jul 2017 10:23:14 +0200
From:   Greg KH <greg@...ah.com>
To:     Alex Williamson <alex.williamson@...hat.com>
Cc:     Russell King - ARM Linux <linux@...linux.org.uk>,
        kvm@...r.kernel.org, eric.auger@...hat.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 7/9] vfio: Use driver_override to avert binding to
 compromising drivers

On Tue, Jul 11, 2017 at 10:41:16AM -0600, Alex Williamson wrote:
> Let me give a concrete scenario, I have a dual-port conventional PCI
> e1000 NIC.  The IOMMU operates on PCIe requester IDs and therefore both
> NIC functions are masked behind the requester ID of a PCIe-to-PCI
> bridge.  We cannot have the e1000 driver managing one function and a
> user managing the other (via vfio-pci).

Agreed, but really, if a user asks to do such a thing, they deserve the
pieces the kernel ends up in, right?

> In this case, not only is the
> DMA not isolated but the functions share the same IOMMU context.
> Therefore in order to allow the user access to one function via
> vfio-pci, the other function needs to be in a known state, either also
> bound to vfio-pci, bound to an innocuous driver like pci-stub, or
> unbound from any driver.  Given this state, user now has access to one
> function of the device, but how can we fix our driver to manage the
> other function?

We have USB drivers that do this all the time, due to crazy USB specs
that required it.  The cdc-acm driver is one example, and I think there
are a number of USB sound devices that also have this issue.  Just have
the driver of the "first" device grab the second one as well and "know"
about the resources involved, as you are doing today.

But, then you somehow seem to have the requirement to prevent userspace
from mucking around in your driver bindings, and really, you shouldn't
care about this, because again, if it messes up here, all bets are off.

> If the other function is also bound to vfio-pci, the driver core does
> not allow us to refuse a driver remove request, the best we can do is
> block for a while, but we best not do that too long so we end up in the
> device unbound state.
> 
> Likewise, if the other function was bound to pci-stub, this driver won't
> block remove, so the device for the other port can transition to an
> unbound state.
> 
> Once in an unbound state, how would fixing either the vfio-pci or the
> core vfio driver prevent the scenario which can now happen of the
> unbound device being bound to the host e1000 driver?  This can happen
> in pure PCIe topologies as well where perhaps the IOMMU context is not
> shared, but the devices still lack DMA isolation within the group.
> 
> The only tool we currently have to manage this scenario is that the
> vfio core driver can pull BUG_ON after the fact of the other device
> being bound to a host driver.

Well, how about just locking the device down, don't crash the kernel.
That at least gives userspace a chance to figure out what they did was
wrong.

> Understandably, users aren't so keen on
> this, which is why I'm trying to allow vfio to block binding of that
> other device before it happens.  None of this really seems to fall
> within the capabilities of the existing driver core, so simply fixing
> my driver doesn't seem to be a well defined option.  Is there a simple
> solution I'm missing?  We're not concerned only with auto-probing, we
> need to protect against explicit bind attempts as well.

Again, if userspace does an explicit bind/unbind attempt, it _HAS_ to
know what it is doing, we can't protect ourselves from that, that's
always been the case.  The bind/unbind was done as a way for people to
say "I know more about what is going on than the kernel does right now,
so I'm going to override it." and we have to trust that they do know
that.  Don't spend a lot of time and energy trying to protect yourself
from that please.

thanks,

greg k-h