Message-ID: <e3f45862-c3a5-8bac-e04d-7be0e76908a9@redhat.com>
Date: Thu, 13 Aug 2020 14:01:58 +0800
From: Jason Wang <jasowang@...hat.com>
To: "Tian, Kevin" <kevin.tian@...el.com>,
Jason Gunthorpe <jgg@...dia.com>,
Alex Williamson <alex.williamson@...hat.com>
Cc: "Jiang, Dave" <dave.jiang@...el.com>,
"vkoul@...nel.org" <vkoul@...nel.org>,
"Dey, Megha" <megha.dey@...el.com>,
"maz@...nel.org" <maz@...nel.org>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"rafael@...nel.org" <rafael@...nel.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"hpa@...or.com" <hpa@...or.com>,
"Pan, Jacob jun" <jacob.jun.pan@...el.com>,
"Raj, Ashok" <ashok.raj@...el.com>,
"Liu, Yi L" <yi.l.liu@...el.com>, "Lu, Baolu" <baolu.lu@...el.com>,
"Kumar, Sanjay K" <sanjay.k.kumar@...el.com>,
"Luck, Tony" <tony.luck@...el.com>,
"Lin, Jing" <jing.lin@...el.com>,
"Williams, Dan J" <dan.j.williams@...el.com>,
"kwankhede@...dia.com" <kwankhede@...dia.com>,
"eric.auger@...hat.com" <eric.auger@...hat.com>,
"parav@...lanox.com" <parav@...lanox.com>,
"Hansen, Dave" <dave.hansen@...el.com>,
"netanelg@...lanox.com" <netanelg@...lanox.com>,
"shahafs@...lanox.com" <shahafs@...lanox.com>,
"yan.y.zhao@...ux.intel.com" <yan.y.zhao@...ux.intel.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"Ortiz, Samuel" <samuel.ortiz@...el.com>,
"Hossain, Mona" <mona.hossain@...el.com>,
"dmaengine@...r.kernel.org" <dmaengine@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: Re: [PATCH RFC v2 00/18] Add VFIO mediated device support and DEV-MSI
support for the idxd driver
On 2020/8/13 1:26 PM, Tian, Kevin wrote:
>> From: Jason Wang <jasowang@...hat.com>
>> Sent: Thursday, August 13, 2020 12:34 PM
>>
>>
>> On 2020/8/12 12:05 PM, Tian, Kevin wrote:
>>>> The problem is that if we tie all controls via VFIO uAPI, the other
>>>> subsystem like vDPA is likely to duplicate them. I wonder if there is a
>>>> way to decouple the vSVA out of VFIO uAPI?
>>> vSVA is a per-device (either pdev or mdev) feature thus naturally should
>>> be managed by its device driver (VFIO or vDPA). From this angle some
>>> duplication is inevitable given VFIO and vDPA are orthogonal passthrough
>>> frameworks. Within the kernel the majority of vSVA handling is done by
>>> IOMMU and IOASID modules thus most logic is shared.
>>
>> So why not introduce vSVA uAPI at IOMMU or IOASID layer?
> One may ask a similar question why IOMMU doesn't expose map/unmap
> as uAPI...
I think this is probably a good idea as well. If anything is missing
from the infrastructure, we can invent it. Besides vhost-vDPA, there are
other subsystems that relay their uAPI to the IOMMU API. Duplicated
uAPIs are usually a hint of code duplication. Simple map/unmap could
be easy, but the vSVA uAPI is much more complicated.
>
>>
>>>>> If an userspace DMA interface can be easily
>>>>> adapted to be a passthrough one, it might be the choice.
>>>> It's not that easy even for VFIO, which requires a lot of new uAPIs and
>>>> infrastructure (e.g. mdev) to be invented.
>>>>
>>>>
>>>>> But for idxd,
>>>>> we see mdev a much better fit here, given the big difference between
>>>>> what userspace DMA requires and what guest driver requires in this hw.
>>>> A weak point for mdev is that it can't serve kernel subsystem other than
>>>> VFIO. In this case, you need some other infrastructures (like [1]) to do
>>>> this.
>>> mdev is not exclusive from kernel usages. It's perfectly fine for a driver
>>> to reserve some work queues for host usages, while wrapping others
>>> into mdevs.
>>
>> I meant you may want slices to be an independent device from the kernel
>> point of view:
>>
>> E.g for ethernet devices, you may want 10K mdevs to be passed to guest.
>>
>> Similarly, you may want 10K net devices which is connected to the kernel
>> networking subsystems.
>>
>> In this case it's not simply reserving queues but you need some other
>> type of device abstraction. There could be some kind of duplication
>> between this and mdev.
>>
> yes, some abstraction required but isn't it what the driver should
> care about instead of mdev framework itself?
With mdev you present a "PCI" device, but what kind of device does it try
to present to the kernel? If it's still PCI, there's duplication with mdev;
if it's something new, maybe we can switch to that API.
> If the driver reports
> the same set of resource to both mdev and networking, it needs to
> make sure when the resource is claimed in one interface then it
> should be marked in-use in another. E.g. each mdev includes an
> available_instances attribute. The driver could report 10K available
> instances initially and then update it to 5K when another 5K is used
> for net devices later.
Right, but this probably means you need another management layer under mdev.
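To make the accounting concrete, such a layer could look roughly like the
sketch below. It is only an illustration in plain userspace C: struct
slice_pool, enum slice_user and the slice_pool_*() helpers are names made up
for this mail, not anything from idxd or the mdev core, and a real driver
would need locking around the counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared pool of hardware slices (illustrative names only). */
struct slice_pool {
	size_t total;      /* slices the hardware exposes */
	size_t used_mdev;  /* slices claimed through the mdev interface */
	size_t used_net;   /* slices claimed as kernel net devices */
};

enum slice_user { SLICE_MDEV, SLICE_NET };

/* What both interfaces would report as "available instances". */
static size_t slice_pool_available(const struct slice_pool *p)
{
	assert(p->used_mdev + p->used_net <= p->total);
	return p->total - p->used_mdev - p->used_net;
}

/* Claim n slices for one interface; fails if the shared pool is too small. */
static bool slice_pool_claim(struct slice_pool *p, enum slice_user who,
			     size_t n)
{
	if (n > slice_pool_available(p))
		return false;
	if (who == SLICE_MDEV)
		p->used_mdev += n;
	else
		p->used_net += n;
	return true;
}

/* Return n slices; fails if that interface does not hold that many. */
static bool slice_pool_release(struct slice_pool *p, enum slice_user who,
			       size_t n)
{
	size_t *used = (who == SLICE_MDEV) ? &p->used_mdev : &p->used_net;

	if (n > *used)
		return false;
	*used -= n;
	return true;
}
```

With total = 10000, claiming 5000 for SLICE_NET leaves 5000 for mdev, which
matches the 10K -> 5K example above. The point is that this bookkeeping sits
below both mdev and the net device side, so neither of them can own it.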
>
> Mdev definitely has its usage limitations. Some may be improved
> in the future, some may not. But those are distracting from the
> original purpose of this thread (mdev vs. userspace DMA) and better
> be discussed in other places e.g. LPC...
Ok.
Thanks
>
> Thanks
> Kevin