lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Tue, 21 May 2019 17:33:16 +0200 From: Pierre Morel <pmorel@...ux.ibm.com> To: Alex Williamson <alex.williamson@...hat.com> Cc: Cornelia Huck <cohuck@...hat.com>, sebott@...ux.vnet.ibm.com, gerald.schaefer@...ibm.com, pasic@...ux.vnet.ibm.com, borntraeger@...ibm.com, walling@...ux.ibm.com, linux-s390@...r.kernel.org, iommu@...ts.linux-foundation.org, joro@...tes.org, linux-kernel@...r.kernel.org, kvm@...r.kernel.org, schwidefsky@...ibm.com, heiko.carstens@...ibm.com, robin.murphy@....com Subject: Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES On 21/05/2019 16:59, Alex Williamson wrote: > On Tue, 21 May 2019 11:14:38 +0200 > Pierre Morel <pmorel@...ux.ibm.com> wrote: > >> On 20/05/2019 20:23, Alex Williamson wrote: >>> On Mon, 20 May 2019 18:31:08 +0200 >>> Pierre Morel <pmorel@...ux.ibm.com> wrote: >>> >>>> On 20/05/2019 16:27, Cornelia Huck wrote: >>>>> On Mon, 20 May 2019 13:19:23 +0200 >>>>> Pierre Morel <pmorel@...ux.ibm.com> wrote: >>>>> >>>>>> On 17/05/2019 20:04, Pierre Morel wrote: >>>>>>> On 17/05/2019 18:41, Alex Williamson wrote: >>>>>>>> On Fri, 17 May 2019 18:16:50 +0200 >>>>>>>> Pierre Morel <pmorel@...ux.ibm.com> wrote: >>>>>>>> >>>>>>>>> We implement the capability interface for VFIO_IOMMU_GET_INFO. >>>>>>>>> >>>>>>>>> When calling the ioctl, the user must specify >>>>>>>>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and >>>>>>>>> must check in the answer if capabilities are supported. >>>>>>>>> >>>>>>>>> The iommu get_attr callback will be used to retrieve the specific >>>>>>>>> attributes and fill the capabilities. >>>>>>>>> >>>>>>>>> Currently two Z-PCI specific capabilities will be queried and >>>>>>>>> filled by the underlying Z specific s390_iommu: >>>>>>>>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes >>>>>>>>> and >>>>>>>>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group. >>>>>>>>> >>>>>>>>> Other architectures may add new capabilities in the same way >>>>>>>>> after enhancing the architecture specific IOMMU driver. >>>>>>>>> >>>>>>>>> Signed-off-by: Pierre Morel <pmorel@...ux.ibm.com> >>>>>>>>> --- >>>>>>>>> drivers/vfio/vfio_iommu_type1.c | 122 >>>>>>>>> +++++++++++++++++++++++++++++++++++++++- >>>>>>>>> 1 file changed, 121 insertions(+), 1 deletion(-) >>>>>>>>> >>>>>>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c >>>>>>>>> b/drivers/vfio/vfio_iommu_type1.c >>>>>>>>> index d0f731c..9435647 100644 >>>>>>>>> --- a/drivers/vfio/vfio_iommu_type1.c >>>>>>>>> +++ b/drivers/vfio/vfio_iommu_type1.c >>>>>>>>> @@ -1658,6 +1658,97 @@ static int >>>>>>>>> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu) >>>>>>>>> return ret; >>>>>>>>> } >>>>>>>>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain, >>>>>>>>> + struct vfio_info_cap *caps, size_t size) >>>>>>>>> +{ >>>>>>>>> + struct vfio_iommu_type1_info_pcifn *info_fn; >>>>>>>>> + int ret; >>>>>>>>> + >>>>>>>>> + info_fn = kzalloc(size, GFP_KERNEL); >>>>>>>>> + if (!info_fn) >>>>>>>>> + return -ENOMEM; >>>>>>>>> + >>>>>>>>> + ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN, >>>>>>>>> + &info_fn->response); >>>>>>>> >>>>>>>> What ensures that the 'struct clp_rsp_query_pci' returned from this >>>>>>>> get_attr remains consistent with a 'struct vfio_iommu_pci_function'? >>>>>>>> Why does the latter contains so many reserved fields (beyond simply >>>>>>>> alignment) for a user API? What fields of these structures are >>>>>>>> actually useful to userspace? Should any fields not be exposed to the >>>>>>>> user? Aren't BAR sizes redundant to what's available through the vfio >>>>>>>> PCI API? I'm afraid that simply redefining an internal structure as >>>>>>>> the API leaves a lot to be desired too. Thanks, >>>>>>>> >>>>>>>> Alex >>>>>>>> >>>>>>> Hi Alex, >>>>>>> >>>>>>> I simply used the structure returned by the firmware to be sure to be >>>>>>> consistent with future evolutions and facilitate the copy from CLP and >>>>>>> to userland. >>>>>>> >>>>>>> If you prefer, and I understand that this is the case, I can define a >>>>>>> specific VFIO_IOMMU structure with only the fields relevant to the user, >>>>>>> leaving future enhancement of the user's interface being implemented in >>>>>>> another kernel patch when the time has come. >>> >>> TBH, I had no idea that CLP is an s390 firmware interface and this is >>> just dumping that to userspace. The cover letter says: >>> >>> Using the PCI VFIO interface allows userland, a.k.a. QEMU, to >>> retrieve ZPCI specific information without knowing Z specific >>> identifiers like the function ID or the function handle of the zPCI >>> function hidden behind the PCI interface. >>> >>> But what does this allow userland to do and what specific pieces of >>> information do they need? We do have a case already where Intel >>> graphics devices have a table (OpRegion) living in host system memory >>> that we expose via a vfio region, so it wouldn't be unprecedented to do >>> something like this, but as Connie suggests, if we knew what was being >>> consumed here and why, maybe we could generalize it into something >>> useful for others. >> >> OK, sorry I try to explain better. >> >> 1) A short description, of zPCI functions and groups >> >> IN Z, PCI cards, leave behind an adapter between subchannels and PCI. >> We access PCI cards through 2 ways: >> - dedicated PCI instructions (pci_load/pci_store/pci/store_block) >> - DMA >> We receive events through >> - Adapter interrupts >> - CHSC events >> >> The adapter propose an IOMMU to protect the DMA >> and the interrupt handling goes through a MSIX like interface handled by >> the adapter. >> >> The architecture specific PCI do the interface between the standard PCI >> level and the zPCI function (PCI + DMA/IOMMU/Interrupt) >> >> To handle the communication through the "zPCI way" the CLP interface >> provides instructions to retrieve informations from the adapters. >> >> There are different group of functions having same functionalities. >> >> clp_list give us a list from zPCI functions >> clp_query_pci_function returns informations specific to a function >> clp_query_group returns information on a function group >> >> >> 2) Why do we need it in the guest >> >> We need to provide the guest with information on the adapters and zPCI >> functions returned by the clp_query instruction so that the guest's >> driver gets the right information on how the way to the zPCI function >> has been built in the host. >> >> >> When a guest issues the CLP instructions we intercept the clp command in >> QEMU and we need to feed the response with the right values for the guest. >> The "right" values are not the raw CLP response values: >> >> - some identifier must be virtualized, like UID and FID, >> >> - some values must match what the host received from the CLP response, >> like the size of the transmited blocks, the DMA Address Space Mask, >> number of interrupt, MSIA >> >> - some other must match what the host handled with the adapter and >> function, the start and end of DMA, >> >> - some what the host IOMMU driver supports (frame size), > > This seems very reminiscent of virtualizing PCI config space... so why > is this being proposed as a VFIO IOMMU ioctl extension? These are all > function level characteristics, right? Should this be a capability on > the VFIO device, or perhaps a region like we used for the Intel > OpRegion (though the structure size seems more akin to a capability > here)? As I mentioned in my previous reply, tying this into the IOMMU > interface seemed to rely on (I assume) an one-to-one-to-one mapping of > PCI function to IOMMU group to IOMMU domain, but that doesn't still > doesn't necessarily lend itself to using the IOMMU for device level > information. If there is IOMMU info, perhaps it needs to be split, ie. > expose a frame size via domain_get_attr, expose device level features > via a device capability, let QEMU assemble these into something > coherent to emulate the clp interface. > >> 3) We have three different way to get This information: >> >> The PCI Linux interface is a standard PCI interface and some Z specific >> information is available in sysfs. >> Not all the information needed to be returned inside the CLP response is >> available. >> So we can not use the sysfs interface to get all the information. >> >> There is a CLP ioctl interface but this interface is not secure in that >> it returns the information for all adapters in the system. >> >> The VFIO interface offers the advantage to point to a single PCI >> function, so more secure than the clp ioctl interface. >> Coupled with the s390_iommu we get access to the zPCI CLP instruction >> and to the values handled by the zPCI driver. >> >> >> 4) Until now we used to fill the CLP response to the guest inside QEMU >> with fixed values corresponding to the only PCI card we supported. >> To support new cards we need to get the right values from the kernel out. > > If it's already emulated, I much prefer figuring out how to expose the > right pieces of information via an appropriate interface to virtualize > fields that are actually necessary rather than simply providing an > interface to dump the clp info straight to userspace and pipe it to the > VM. Thanks, > > Alex > OK, I understand. Seems very clear, IOMMU features through IOMMU interface device features through device interface. Some times I do not understand what I did. Seems I messed up. Thanks to have take time to explain. Been back in a while with a better series. Regards, Pierre -- Pierre Morel Linux/KVM/QEMU in Böblingen - Germany
Powered by blists - more mailing lists