linux-kernel - RE: Plan for /dev/ioasid RFC v2

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <MWHPR11MB188674B8CF9690037129236C8C079@MWHPR11MB1886.namprd11.prod.outlook.com>
Date:   Thu, 24 Jun 2021 05:59:40 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     David Gibson <david@...son.dropbear.id.au>
CC:     "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        Jason Wang <jasowang@...hat.com>,
        Kirti Wankhede <kwankhede@...dia.com>,
        "Jean-Philippe Brucker" <jean-philippe@...aro.org>,
        "Jiang, Dave" <dave.jiang@...el.com>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        Jonathan Corbet <corbet@....net>,
        "Jason Gunthorpe" <jgg@...dia.com>,
        "parav@...lanox.com" <parav@...lanox.com>,
        "Alex Williamson" <alex.williamson@...hat.com>,
        "Enrico Weigelt, metux IT consult" <lkml@...ux.net>,
        Robin Murphy <robin.murphy@....com>,
        LKML <linux-kernel@...r.kernel.org>,
        Shenming Lu <lushenming@...wei.com>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "Paolo Bonzini" <pbonzini@...hat.com>,
        David Woodhouse <dwmw2@...radead.org>
Subject: RE: Plan for /dev/ioasid RFC v2

> From: David Gibson
> Sent: Thursday, June 24, 2021 12:26 PM
> 
> On Fri, Jun 18, 2021 at 04:57:40PM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@...dia.com>
> > > Sent: Friday, June 18, 2021 8:20 AM
> > >
> > > On Thu, Jun 17, 2021 at 03:14:52PM -0600, Alex Williamson wrote:
> > >
> > > > I've referred to this as a limitation of type1, that we can't put
> > > > devices within the same group into different address spaces, such as
> > > > behind separate vRoot-Ports in a vIOMMU config, but really, who cares?
> > > > As isolation support improves we see fewer multi-device groups, this
> > > > scenario becomes the exception.  Buy better hardware to use the
> devices
> > > > independently.
> > >
> > > This is basically my thinking too, but my conclusion is that we should
> > > not continue to make groups central to the API.
> > >
> > > As I've explained to David this is actually causing functional
> > > problems and mess - and I don't see a clean way to keep groups central
> > > but still have the device in control of what is happening. We need
> > > this device <-> iommu connection to be direct to robustly model all
> > > the things that are in the RFC.
> > >
> > > To keep groups central someone needs to sketch out how to solve
> > > today's mdev SW page table and mdev PASID issues in a clean
> > > way. Device centric is my suggestion on how to make it clean, but I
> > > haven't heard an alternative??
> > >
> > > So, I view the purpose of this discussion to scope out what a
> > > device-centric world looks like and then if we can securely fit in the
> > > legacy non-isolated world on top of that clean future oriented
> > > API. Then decide if it is work worth doing or not.
> > >
> > > To my mind it looks like it is not so bad, granted not every detail is
> > > clear, and no code has be sketched, but I don't see a big scary
> > > blocker emerging. An extra ioctl or two, some special logic that
> > > activates for >1 device groups that looks a lot like VFIO's current
> > > logic..
> > >
> > > At some level I would be perfectly fine if we made the group FD part
> > > of the API for >1 device groups - except that complexifies every user
> > > space implementation to deal with that. It doesn't feel like a good
> > > trade off.
> > >
> >
> > Would it be an acceptable tradeoff by leaving >1 device groups
> > supported only via legacy VFIO (which is anyway kept for backward
> > compatibility), if we think such scenario is being deprecated over
> > time (thus little value to add new features on it)? Then all new
> > sub-systems including vdpa and new vfio only support singleton
> > device group via /dev/iommu...
> 
> The case that worries me here is if you *thought* you had 1 device
> groups, but then discover a hardware bug which means two things aren't
> as isolated as you thought they were.  What do you do then?

I didn't get your point. If such hardware bug leaves two associated
devices in separate groups, what can software do? Even with existing
VFIO mechanism they can be attached to different containers before
the bug is identified since the kernel thinks they are isolated. If the 
after-fact mitigation is to kill the VM and then force two devices 
attached to a single VFIO container (after such hardware bug is identified), 
same mitigation can be applied here i.e. the user should fall back to 
legacy VFIO instead of attempting to use new /dev/iommu for such 
devices...

Thanks
Kevin