lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <BN9PR11MB543328B13905017E11355AC78CB99@BN9PR11MB5433.namprd11.prod.outlook.com>
Date:   Fri, 15 Oct 2021 01:01:38 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     Jason Gunthorpe <jgg@...dia.com>
CC:     Alex Williamson <alex.williamson@...hat.com>,
        "Liu, Yi L" <yi.l.liu@...el.com>, "hch@....de" <hch@....de>,
        "jasowang@...hat.com" <jasowang@...hat.com>,
        "joro@...tes.org" <joro@...tes.org>,
        "jean-philippe@...aro.org" <jean-philippe@...aro.org>,
        "parav@...lanox.com" <parav@...lanox.com>,
        "lkml@...ux.net" <lkml@...ux.net>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "lushenming@...wei.com" <lushenming@...wei.com>,
        "eric.auger@...hat.com" <eric.auger@...hat.com>,
        "corbet@....net" <corbet@....net>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "yi.l.liu@...ux.intel.com" <yi.l.liu@...ux.intel.com>,
        "Tian, Jun J" <jun.j.tian@...el.com>, "Wu, Hao" <hao.wu@...el.com>,
        "Jiang, Dave" <dave.jiang@...el.com>,
        "jacob.jun.pan@...ux.intel.com" <jacob.jun.pan@...ux.intel.com>,
        "kwankhede@...dia.com" <kwankhede@...dia.com>,
        "robin.murphy@....com" <robin.murphy@....com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "dwmw2@...radead.org" <dwmw2@...radead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
        "david@...son.dropbear.id.au" <david@...son.dropbear.id.au>,
        "nicolinc@...dia.com" <nicolinc@...dia.com>
Subject: RE: [RFC 10/20] iommu/iommufd: Add IOMMU_DEVICE_GET_INFO

> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Thursday, October 14, 2021 11:43 PM
> 
> > > > I think the key is whether other archs allow driver to decide DMA
> > > > coherency and indirectly the underlying I/O page table format.
> > > > If yes, then I don't see a reason why such decision should not be
> > > > given to userspace for passthrough case.
> > >
> > > The choice all comes down to if the other arches have cache
> > > maintenance instructions in the VM that *don't work*
> >
> > Looks vfio always sets IOMMU_CACHE on all platforms as long as
> > iommu supports it (true on all platforms except intel iommu which
> > is dedicated for GPU):
> >
> > vfio_iommu_type1_attach_group()
> > {
> > 	...
> > 	if (iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
> > 		domain->prot |= IOMMU_CACHE;
> > 	...
> > }
> >
> > Should above be set according to whether a device is coherent?
> 
> For IOMMU_CACHE there are two questions related to the overloaded
> meaning:
> 
>  - Should VFIO ask the IOMMU to use non-coherent DMA (ARM meaning)
>    This depends on how the VFIO user expects to operate the DMA.
>    If the VFIO user can issue cache maintenance ops then IOMMU_CACHE
>    should be controlled by the user. I have no idea what platforms
>    support user space cache maintenance ops.

But just like you said for intel meaning below, even if those ops are
privileged a uAPI can be provided to support such usage if necessary.

> 
>  - Should VFIO ask the IOMMU to suppress no-snoop (Intel meaning)
>    This depends if the VFIO user has access to wbinvd or not.
> 
>    wbinvd is a privileged instruction so normally userspace will not
>    be able to access it.
> 
>    Per Paolo recommendation there should be a uAPI someplace that
>    allows userspace to issue wbinvd - basically the suppress no-snoop
>    is also user controllable.
> 
> The two things are very similar and ultimately are a choice userspace
> should be making.

yes

> 
> From something like a qemu perspective things are more murkey - eg on
> ARM qemu needs to co-ordinate with the guest. Whatever IOMMU_CACHE
> mode VFIO is using must match the device coherent flag in the Linux
> guest. I'm guessing all Linux guest VMs only use coherent DMA for all
> devices today. I don't know if the cache maintaince ops are even
> permitted in an ARM VM.
> 

I'll leave it to Jean to confirm. If only coherent DMA can be used in
the guest on other platforms, suppose VFIO should not blindly set 
IOMMU_CACHE and in concept it should deny assigning a non-coherent 
device since no co-ordination with guest exists today.

So the bottomline is that we'll keep this no-snoop thing Intel-specific. 
For the basic skeleton we'll not support no-snoop thus the user 
needs to set enforce-snoop flag when creating an IOAS like this RFC v1
does. Also need to introduce a new flag instead of abusing 
IOMMU_CACHE in the kernel. For other platforms it may need a fix 
to deny non-coherent device (based on above open) for now.

Thanks
Kevin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ