Message-ID: <BN9PR11MB527661C48959F977AC3594438C709@BN9PR11MB5276.namprd11.prod.outlook.com>
Date:   Thu, 9 Dec 2021 12:31:05 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        "Jiang, Dave" <dave.jiang@...el.com>,
        Jason Gunthorpe <jgg@...dia.com>
CC:     Logan Gunthorpe <logang@...tatee.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Bjorn Helgaas <helgaas@...nel.org>,
        "Marc Zygnier" <maz@...nel.org>,
        Alex Williamson <alex.williamson@...hat.com>,
        "Dey, Megha" <megha.dey@...el.com>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jon Mason <jdmason@...zu.us>, Allen Hubbe <allenbh@...il.com>,
        "linux-ntb@...glegroups.com" <linux-ntb@...glegroups.com>,
        "linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
        Heiko Carstens <hca@...ux.ibm.com>,
        "Christian Borntraeger" <borntraeger@...ibm.com>,
        "x86@...nel.org" <x86@...nel.org>, Joerg Roedel <jroedel@...e.de>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>
Subject: RE: [patch 21/32] NTB/msi: Convert to msi_on_each_desc()

> From: Thomas Gleixner <tglx@...utronix.de>
> Sent: Thursday, December 9, 2021 4:37 PM
> 
> On Thu, Dec 09 2021 at 05:23, Kevin Tian wrote:
> >> From: Thomas Gleixner <tglx@...utronix.de>
> >> I don't see anything wrong with that. A subdevice is its own entity and
> >> VFIO can choose the most convenient representation of it to the guest
> >> obviously.
> >>
> >> How that is backed on the host does not really matter. You can expose
> >> MSI-X to the guest with an INTx backing as well.
> >>
> >
> > Agree with this point. How the interrupts are represented to the guest
> > is orthogonal to how the backend resource is allocated. Physically MSI-X
> > and IMS can be enabled simultaneously on an IDXD device. Once
> > dynamic allocation is allowed for both, either one can be allocated for
> > a subdevice (the only difference being the number of supported subdevices).
> >
> > When an interrupt resource is exposed to the guest with the same type
> > (e.g. MSI-on-MSI or IMS-on-IMS), it can also be passed through to the
> > guest as long as a hypercall machinery is in place to get addr/data pair
> > from the host (as you suggested earlier).
> 
> As I pointed out in the conclusion of this thread, IMS is only going to
> be supported with interrupt remapping in place on both host and guest.

I still need to read the last few mails but thanks for pointing it out now.

> 
> As these devices are requiring a vIOMMU on the guest anyway (PASID, User
> IO page tables), the required hypercalls are part of the vIOMMU/IR
> implementation. If you look at it from the irqdomain hierarchy view:
> 
>                          |- PCI-MSI
>   VECTOR -- [v]IOMMU/IR -|- PCI-MSI-X
>                          |- PCI-IMS
> 
> So host and guest use just the same representation which makes a ton of
> sense.
> 
> There are two places where this matters:
> 
>   1) The activate() callback of the IR domain
> 
>   2) The irq_set_affinity() callback of the irqchip associated with the
>      IR domain
> 
> Both callbacks are allowed to fail and the error code is handed back to
> the originating call site.
> 
> If you look at the above hierarchy view then MSI/MSI-X/IMS are all
> treated in exactly the same way. It all becomes the common case.
> 
> No?
> 

Yes, I think the above makes sense.

For a new guest OS which supports this enlightened hierarchy, the same
machinery works for all types of interrupt storage, and we have a
failure path from host to guest in case of a host-side resource shortage.
No trap is required on guest access to the interrupt storage either.
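
To illustrate the failure path, here is a minimal sketch in C of a
three-level hierarchy (VECTOR -- IR -- leaf) in which the IR level can run
out of entries and the error code is handed back to the caller. This is a
toy model under stated assumptions, not kernel code: struct toy_domain and
all toy_*() helpers are hypothetical names invented for this example, and
-ENOSPC is only an assumed error code for the shortage case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the kernel's irq_domain structs. */
struct toy_domain {
	const char *name;
	struct toy_domain *parent;
	int (*activate)(struct toy_domain *d);    /* allowed to fail */
	void (*deactivate)(struct toy_domain *d);
	int free_entries;   /* host-side resource, used by the IR level only */
	int active;         /* number of currently active interrupts */
};

/* IR level: consumes one remapping entry per interrupt, fails when empty. */
static int toy_ir_activate(struct toy_domain *d)
{
	if (d->free_entries == 0)
		return -ENOSPC;
	d->free_entries--;
	return 0;
}

static void toy_ir_deactivate(struct toy_domain *d)
{
	d->free_entries++;
}

/* Undo one activation on every level from @d up to the root. */
static void toy_deactivate(struct toy_domain *d)
{
	for (; d; d = d->parent) {
		assert(d->active > 0);
		if (d->deactivate)
			d->deactivate(d);
		d->active--;
	}
}

/* Activate parents first; on failure roll back and return the error code. */
static int toy_activate(struct toy_domain *d)
{
	int ret;

	if (!d)
		return 0;
	ret = toy_activate(d->parent);
	if (ret)
		return ret;
	if (d->activate) {
		ret = d->activate(d);
		if (ret) {
			toy_deactivate(d->parent);
			return ret;
		}
	}
	d->active++;
	return 0;
}

/*
 * Build VECTOR -- IR -- leaf and try to activate @n interrupts on the leaf.
 * Returns how many succeeded; *err holds the first error (0 if none).
 */
static int toy_try_activate_n(int ir_entries, int n, int *err)
{
	struct toy_domain vector = { .name = "VECTOR" };
	struct toy_domain ir = { .name = "IR", .parent = &vector,
				 .activate = toy_ir_activate,
				 .deactivate = toy_ir_deactivate,
				 .free_entries = ir_entries };
	struct toy_domain leaf = { .name = "PCI-IMS", .parent = &ir };
	int done = 0;

	*err = 0;
	while (done < n) {
		*err = toy_activate(&leaf);
		if (*err)
			break;
		done++;
	}
	/* Rollback must leave the root count equal to the successes. */
	assert(vector.active == done);
	return done;
}
```

The leaf could equally be labeled PCI-MSI or PCI-MSI-X; nothing in the
activation path depends on the leaf type, which is the point of treating
them all as the common case.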

A legacy guest OS which doesn't support the enlightened hierarchy
can only use MSI/MSI-X, which is still trapped. But with the vector
reallocation support from your work, the situation is already much
better than the current awkward approach in VFIO (free all previous
vectors and then re-allocate).
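
The difference between the two approaches can be sketched with a small
counting model in C. It is only an illustration: struct toy_vectors and
both toy_grow_*() helpers are hypothetical and do not correspond to any
VFIO or PCI/MSI function. It merely counts how many vectors get torn down
and set up when the guest grows its vector count.

```c
#include <assert.h>

/* Toy bookkeeping only; not a VFIO or PCI/MSI data structure. */
struct toy_vectors {
	int nvec;       /* vectors currently allocated */
	int setups;     /* vectors set up so far */
	int teardowns;  /* vectors torn down so far */
};

/* Free-all-then-reallocate: every existing vector is torn down first. */
static void toy_grow_legacy(struct toy_vectors *v, int new_nvec)
{
	assert(new_nvec >= v->nvec);
	v->teardowns += v->nvec;
	v->nvec = 0;
	v->setups += new_nvec;
	v->nvec = new_nvec;
}

/* Dynamic allocation: existing vectors stay, only the delta is added. */
static void toy_grow_dynamic(struct toy_vectors *v, int new_nvec)
{
	assert(new_nvec >= v->nvec);
	v->setups += new_nvec - v->nvec;
	v->nvec = new_nvec;
}
```

Growing from 4 to 6 vectors costs 4 teardowns and 6 setups in the first
model, but only 2 setups and no teardown in the second.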

Overall I think this is a good model.

Thanks
Kevin
