Message-ID: <20210602224536.GJ1002214@nvidia.com>
Date:   Wed, 2 Jun 2021 19:45:36 -0300
From:   Jason Gunthorpe <jgg@...dia.com>
To:     Alex Williamson <alex.williamson@...hat.com>
Cc:     "Tian, Kevin" <kevin.tian@...el.com>,
        Jean-Philippe Brucker <jean-philippe@...aro.org>,
        "Jiang, Dave" <dave.jiang@...el.com>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        Jonathan Corbet <corbet@....net>,
        Robin Murphy <robin.murphy@....com>,
        LKML <linux-kernel@...r.kernel.org>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        David Gibson <david@...son.dropbear.id.au>,
        Kirti Wankhede <kwankhede@...dia.com>,
        David Woodhouse <dwmw2@...radead.org>,
        Jason Wang <jasowang@...hat.com>
Subject: Re: [RFC] /dev/ioasid uAPI proposal

On Wed, Jun 02, 2021 at 02:37:34PM -0600, Alex Williamson wrote:

> Right.  I don't follow where you're jumping to relaying DMA_PTE_SNP
> from the guest page table... what page table?  

I see my confusion now, the phrasing in your earlier remark led me
to think this was about allowing the no-snoop performance enhancement
in some restricted way.

It is really about blocking no-snoop 100% of the time and then
disabling the dangerous wbinvd when the block is successful.

Didn't closely read the kvm code :\

If it was about allowing the optimization then I'd expect the guest to
enable no-snoopable regions via its vIOMMU and relay them to the
hypervisor and plumb the whole thing through. Hence my remark about
the guest page tables..

So really the test is just 'were we able to block it' ?

> This support existed before mdev, IIRC we needed it for direct
> assignment of NVIDIA GPUs.

Probably because they ignored the disable no-snoop bits in the control
block, or reset them in some insane way to "fix" broken BIOSes and
kept using it even though by all rights qemu would have tried hard to
turn it off via the config space. Processing no-snoop without a
working wbinvd would be fatal. Yeesh

But Ok, back to /dev/ioasid. This answers a few lingering questions I
had..

1) Mixing IOMMU_CAP_CACHE_COHERENCY and !IOMMU_CAP_CACHE_COHERENCY
   domains.

   This doesn't actually matter. If you mix them together then kvm
   will turn on wbinvd anyhow, so we don't need to use the DMA_PTE_SNP
   anywhere in this VM.

   Thus if two IOMMUs are joined together into a single /dev/ioasid
   then we can just make them both pretend to be
   !IOMMU_CAP_CACHE_COHERENCY and both not set IOMMU_CACHE.

2) How to fit this part of kvm in some new /dev/ioasid world

   What we want to do here is iterate over every ioasid associated
   with the group fd that is passed into kvm.

   Today the group fd has a single container which specifies the
   single ioasid so this is being done trivially.

   To reorg we want to get the ioasid from the device not the
   group (see my note to David about the groups vs device rationale)

   This is just iterating over each vfio_device in the group and
   querying the ioasid it is using.

   Or perhaps more directly: an op attaching the vfio_device to the
   kvm and having some simple helper 
         '(un)register ioasid with kvm (kvm, ioasid)'
   that the vfio_device driver can call that just sorts this out.

   It is not terrible..
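
As a rough illustration of points 1) and 2), here is a small userspace
model in C. It is only a sketch under stated assumptions, not kernel
code: every name prefixed model_ is hypothetical and invented for this
example, and the counter merely stands in for however kvm would track
"could not block no-snoop". The idea is that an ioasid only uses
IOMMU_CACHE when every IOMMU joined under it reports
IOMMU_CAP_CACHE_COHERENCY, and that kvm keeps wbinvd enabled for as
long as at least one registered ioasid could not block no-snoop.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */

struct model_iommu {
    bool cap_cache_coherency;          /* stands in for IOMMU_CAP_CACHE_COHERENCY */
};

struct model_ioasid {
    const struct model_iommu *iommus;  /* IOMMUs joined under this ioasid */
    size_t n_iommus;
};

struct model_vfio_device {
    const struct model_ioasid *ioasid; /* NULL if the device is not attached */
};

struct model_kvm {
    unsigned int noncoherent_count;    /* registrations that cannot block no-snoop */
};

/*
 * Point 1: if any joined IOMMU lacks the coherency capability, treat all
 * of them as non-coherent and do not use IOMMU_CACHE for this ioasid.
 */
static bool model_ioasid_use_iommu_cache(const struct model_ioasid *ioasid)
{
    for (size_t i = 0; i < ioasid->n_iommus; i++)
        if (!ioasid->iommus[i].cap_cache_coherency)
            return false;
    return true;
}

/* Point 2: the '(un)register ioasid with kvm (kvm, ioasid)' helper. */
static void model_kvm_register_ioasid(struct model_kvm *kvm,
                                      const struct model_ioasid *ioasid)
{
    if (!model_ioasid_use_iommu_cache(ioasid))
        kvm->noncoherent_count++;
}

static void model_kvm_unregister_ioasid(struct model_kvm *kvm,
                                        const struct model_ioasid *ioasid)
{
    if (!model_ioasid_use_iommu_cache(ioasid) && kvm->noncoherent_count > 0)
        kvm->noncoherent_count--;
}

/* wbinvd stays enabled while any registration could not block no-snoop. */
static bool model_kvm_needs_wbinvd(const struct model_kvm *kvm)
{
    return kvm->noncoherent_count > 0;
}

/*
 * Iterate over each device and register the ioasid it is using. Devices
 * sharing an ioasid register it once each, so unregistration must be
 * symmetric (one call per device).
 */
static void model_kvm_attach_devices(struct model_kvm *kvm,
                                     const struct model_vfio_device *devs,
                                     size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].ioasid)
            model_kvm_register_ioasid(kvm, devs[i].ioasid);
}
```

In this model a VM with one fully coherent ioasid and one mixed ioasid
ends up needing wbinvd, and stops needing it once the mixed ioasid is
unregistered, which is the 'were we able to block it' test in
miniature.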

Jason
