Message-ID: <CAAHg+HgosCe=1T5ER-4AS1g779RxCDjcrazKV4CQC-43zDK-+Q@mail.gmail.com>
Date: Tue, 4 Aug 2015 11:18:19 +0530
From: Pranavkumar Sawargaonkar <pranavkumar@...aro.org>
To: Bhushan Bharat <Bharat.Bhushan@...escale.com>
Cc: "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
Alex Williamson <alex.williamson@...hat.com>,
"kvmarm@...ts.cs.columbia.edu" <kvmarm@...ts.cs.columbia.edu>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"christoffer.dall@...aro.org" <christoffer.dall@...aro.org>,
"marc.zyngier@....com" <marc.zyngier@....com>,
"will.deacon@....com" <will.deacon@....com>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"arnd@...db.de" <arnd@...db.de>,
"rob.herring@...aro.org" <rob.herring@...aro.org>,
"eric.auger@...aro.org" <eric.auger@...aro.org>,
"patches@....com" <patches@....com>,
Stuart Yoder <stuart.yoder@...escale.com>
Subject: Re: [RFC 0/2] VFIO: Add virtual MSI doorbell support.
Hi Bharat,
On 28 July 2015 at 23:28, Alex Williamson <alex.williamson@...hat.com> wrote:
> On Tue, 2015-07-28 at 17:23 +0000, Bhushan Bharat wrote:
>> Hi Alex,
>>
>> > -----Original Message-----
>> > From: Alex Williamson [mailto:alex.williamson@...hat.com]
>> > Sent: Tuesday, July 28, 2015 9:52 PM
>> > To: Pranavkumar Sawargaonkar
>> > Cc: kvm@...r.kernel.org; kvmarm@...ts.cs.columbia.edu; linux-arm-
>> > kernel@...ts.infradead.org; linux-kernel@...r.kernel.org;
>> > christoffer.dall@...aro.org; marc.zyngier@....com; will.deacon@....com;
>> > bhelgaas@...gle.com; arnd@...db.de; rob.herring@...aro.org;
>> > eric.auger@...aro.org; patches@....com; Bhushan Bharat-R65777; Yoder
>> > Stuart-B08248
>> > Subject: Re: [RFC 0/2] VFIO: Add virtual MSI doorbell support.
>> >
>> > On Fri, 2015-07-24 at 14:33 +0530, Pranavkumar Sawargaonkar wrote:
>> > > In the current VFIO MSI/MSI-X implementation, the Linux host kernel
>> > > allocates MSI/MSI-X vectors when userspace requests them through VFIO
>> > > ioctls. VFIO creates irqfd mappings to notify userspace of MSI/MSI-X
>> > > interrupts when they are raised.
>> > > The guest OS sees an emulated MSI/MSI-X controller and receives an
>> > > interrupt when the kernel signals it via irqfd.
>> > >
>> > > The host kernel allocates MSI/MSI-X vectors using standard Linux
>> > > routines such as pci_enable_msix_range() and pci_enable_msi_range().
>> > > These routines, along with request_irq() in the host kernel, set up
>> > > MSI/MSI-X vectors with the physical MSI/MSI-X addresses provided by the
>> > > interrupt controller driver in the host kernel.
>> > >
>> > > This means that when a device is assigned to the guest OS, the MSI/MSI-X
>> > > addresses present in the PCIe EP are the PAs programmed by the host
>> > > Linux kernel.
>> > >
>> > > On x86, the MSI/MSI-X physical address range is reserved, the IOMMU is
>> > > aware of these addresses, and translation is bypassed for this range.
>> > >
>> > > Unlike x86, ARM/ARM64 does not reserve an MSI/MSI-X physical address
>> > > range, and all transactions, including MSIs, go through the IOMMU/SMMU
>> > > without bypass.
>> > > This requires extending the current VFIO MSI layer with additional
>> > > functionality for ARM/ARM64:
>> > > 1. Program an IOVA (referred to as an MSI virtual doorbell address)
>> > >    into the device's MSI vector as the MSI address.
>> > >    This IOVA will be provided by userspace, based on the
>> > >    MSI/MSI-X addresses reserved for the guest.
>> > > 2. Create an IOMMU mapping between this IOVA and the
>> > >    physical address (PA) assigned to the MSI vector.
>> > >
>> > > This RFC proposes a solution for MSI/MSI-X passthrough on ARM/ARM64.
>> >
>> >
>> > Hi Pranavkumar,
>> >
>> > Freescale has the same, or very similar, need, so any solution in this space
>> > will need to work for both ARM and powerpc. I'm not a big fan of this
>> > approach as it seems to require the user to configure MSI/X via ioctl and then
>> > call a separate ioctl mapping the doorbells. That's more code for the user,
>> > more code to get wrong and potentially a gap between configuring MSI/X
>> > and enabling mappings where we could see IOMMU faults.
>> >
>> > If we know that doorbell mappings are required, why can't we set aside a
>> > bank of IOVA space and have them mapped automatically as MSI/X is being
>> > configured? Then the user's need for special knowledge and handling of this
>> > case is limited to setup. The IOVA space will be mapped and used as needed,
>> > we only need the user to specify the IOVA space reserved for this. Thanks,
>>
>> We probably need a mix of both to support Freescale PowerPC and ARM
>> based machines.
>> In this mixed mode, the kernel VFIO driver will reserve some IOVA space
>> for mapping the MSI page(s).
>
> If vfio is reserving pages independently from the user, this becomes
> what Marc called "shaping" the VM and what x86 effectively does. An
> interface extension should expose these implicit regions so the user can
> avoid them for DMA memory mapping.
>
>> If any other IOVA mapping overlaps with this, an error will be returned
>> to user-space. Ideally this should be chosen in such a way
>> that it never overlaps, which is easy on some systems but can be tricky
>> on others, like Freescale PowerPC. This is not sufficient
>> for at least Freescale PowerPC based SoCs. This is because of a hardware
>> limitation, where we need to fit this reserved IOVA address within the
>> aperture decided by user-space. So if we allow user-space to change
>> this reserved IOVA address to a value decided by user-space itself,
>> then we can support both ARM and PowerPC based solutions.
>
> Yes, that's my intention, to allow userspace to specify the reserved
> region. I believe you have some additional restrictions on the number
> of MSI banks available and whether MSI banks can be shared, but I would
> hope that doesn't preclude a shared interface with ARM.
>
>> I have an implementation ready and tested with this approach, and if
>> this approach looks good then I can submit an RFC patch.
>
> Yes, please post. Thanks,

Could you please share a tentative timeline for posting your patches?
Also, are you planning to post counterpart patches for QEMU or kvmtool?

Thanks,
Pranav
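
For illustration only, the reserved-region behaviour discussed in the quoted text (a user-specified MSI doorbell IOVA window that must fit inside the aperture, with overlapping DMA mappings rejected) could be modelled in plain C roughly as follows. This is a minimal userspace sketch under stated assumptions, not kernel or VFIO code: struct iova_range and every function name below are hypothetical and do not come from any posted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type for illustration; not part of the VFIO API. */
struct iova_range {
    uint64_t start; /* first IOVA in the range */
    uint64_t size;  /* length in bytes, must be non-zero */
};

/* Last address covered by the range (inclusive), guarding against wrap. */
static uint64_t range_last(struct iova_range r)
{
    assert(r.size > 0 && r.start <= UINT64_MAX - (r.size - 1));
    return r.start + (r.size - 1);
}

/* True when the two ranges share at least one address. */
static bool ranges_overlap(struct iova_range a, struct iova_range b)
{
    return a.start <= range_last(b) && b.start <= range_last(a);
}

/* True when inner lies entirely inside outer. */
static bool range_contains(struct iova_range outer, struct iova_range inner)
{
    return inner.start >= outer.start &&
           range_last(inner) <= range_last(outer);
}

/*
 * Accept a user-chosen MSI doorbell window only if it fits inside the
 * aperture (the Freescale PowerPC constraint mentioned in the thread).
 */
static bool msi_window_valid(struct iova_range aperture,
                             struct iova_range msi)
{
    return range_contains(aperture, msi);
}

/*
 * Accept a DMA mapping only if it stays inside the aperture and does not
 * touch the reserved MSI window; a real driver would return an error
 * to user-space in the rejected case.
 */
static bool dma_map_allowed(struct iova_range aperture,
                            struct iova_range msi,
                            struct iova_range dma)
{
    return range_contains(aperture, dma) && !ranges_overlap(msi, dma);
}
```

With a 1 GiB aperture starting at 0 and a 64 KiB MSI window at 0x08000000, a DMA mapping at 0x10000000 would be accepted, while one starting at 0x0800f000 would be rejected because it crosses into the reserved window.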