lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 13 Aug 2020 14:01:58 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     "Tian, Kevin" <kevin.tian@...el.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        Alex Williamson <alex.williamson@...hat.com>
Cc:     "Jiang, Dave" <dave.jiang@...el.com>,
        "vkoul@...nel.org" <vkoul@...nel.org>,
        "Dey, Megha" <megha.dey@...el.com>,
        "maz@...nel.org" <maz@...nel.org>,
        "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
        "rafael@...nel.org" <rafael@...nel.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "hpa@...or.com" <hpa@...or.com>,
        "Pan, Jacob jun" <jacob.jun.pan@...el.com>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "Liu, Yi L" <yi.l.liu@...el.com>, "Lu, Baolu" <baolu.lu@...el.com>,
        "Kumar, Sanjay K" <sanjay.k.kumar@...el.com>,
        "Luck, Tony" <tony.luck@...el.com>,
        "Lin, Jing" <jing.lin@...el.com>,
        "Williams, Dan J" <dan.j.williams@...el.com>,
        "kwankhede@...dia.com" <kwankhede@...dia.com>,
        "eric.auger@...hat.com" <eric.auger@...hat.com>,
        "parav@...lanox.com" <parav@...lanox.com>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        "netanelg@...lanox.com" <netanelg@...lanox.com>,
        "shahafs@...lanox.com" <shahafs@...lanox.com>,
        "yan.y.zhao@...ux.intel.com" <yan.y.zhao@...ux.intel.com>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "Ortiz, Samuel" <samuel.ortiz@...el.com>,
        "Hossain, Mona" <mona.hossain@...el.com>,
        "dmaengine@...r.kernel.org" <dmaengine@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: Re: [PATCH RFC v2 00/18] Add VFIO mediated device support and DEV-MSI
 support for the idxd driver


On 2020/8/13 下午1:26, Tian, Kevin wrote:
>> From: Jason Wang <jasowang@...hat.com>
>> Sent: Thursday, August 13, 2020 12:34 PM
>>
>>
>> On 2020/8/12 下午12:05, Tian, Kevin wrote:
>>>> The problem is that if we tie all controls via VFIO uAPI, the other
>>>> subsystem like vDPA is likely to duplicate them. I wonder if there is a
>>>> way to decouple the vSVA out of VFIO uAPI?
>>> vSVA is a per-device (either pdev or mdev) feature thus naturally should
>>> be managed by its device driver (VFIO or vDPA). From this angle some
>>> duplication is inevitable given VFIO and vDPA are orthogonal passthrough
>>> frameworks. Within the kernel the majority of vSVA handling is done by
>>> IOMMU and IOASID modules thus most logic are shared.
>>
>> So why not introduce vSVA uAPI at IOMMU or IOASID layer?
> One may ask a similar question why IOMMU doesn't expose map/unmap
> as uAPI...


I think this is probably a good idea as well. If there's anything missed 
in the infrastructure, we can invent. Besides vhost-vDPA, there are 
other subsystems that relaying their uAPI to IOMMU API. Duplicating 
uAPIs is usually a hint of the codes duplication. Simple map/unmap could 
be easy but vSVA uAPI is much more complicated.


>
>>
>>>>>     If an userspace DMA interface can be easily
>>>>> adapted to be a passthrough one, it might be the choice.
>>>> It's not that easy even for VFIO which requires a lot of new uAPIs and
>>>> infrastructures(e.g mdev) to be invented.
>>>>
>>>>
>>>>> But for idxd,
>>>>> we see mdev a much better fit here, given the big difference between
>>>>> what userspace DMA requires and what guest driver requires in this hw.
>>>> A weak point for mdev is that it can't serve kernel subsystem other than
>>>> VFIO. In this case, you need some other infrastructures (like [1]) to do
>>>> this.
>>> mdev is not exclusive from kernel usages. It's perfectly fine for a driver
>>> to reserve some work queues for host usages, while wrapping others
>>> into mdevs.
>>
>> I meant you may want slices to be an independent device from the kernel
>> point of view:
>>
>> E.g for ethernet devices, you may want 10K mdevs to be passed to guest.
>>
>> Similarly, you may want 10K net devices which is connected to the kernel
>> networking subsystems.
>>
>> In this case it's not simply reserving queues but you need some other
>> type of device abstraction. There could be some kind of duplication
>> between this and mdev.
>>
> yes, some abstraction required but isn't it what the driver should
> care about instead of mdev framework itself?


With mdev you present a "PCI" device, but what's kind of device it tries 
to present to kernel? If it's still PCI, there's duplication with mdev, 
if it's something new, maybe we can switch to that API.


> If the driver reports
> the same set of resource to both mdev and networking, it needs to
> make sure when the resource is claimed in one interface then it
> should be marked in-use in another. e.g. each mdev includes a
> available_intances attribute. the driver could report 10k available
> instances initially and then update it to 5K when another 5K is used
> for net devices later.


Right but this probably means you need another management layer under mdev.


>
> Mdev definitely has its usage limitations. Some may be improved
> in the future, some may not. But those are distracting from the
> original purpose of this thread (mdev vs. userspace DMA) and better
> be discussed in other places e.g. LPC...


Ok.

Thanks


>
> Thanks
> Kevin

Powered by blists - more mailing lists