lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <MWHPR11MB1645FC76A1866EA6A7C9DA6F8C5D0@MWHPR11MB1645.namprd11.prod.outlook.com>
Date:   Wed, 19 Aug 2020 07:29:49 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     Jason Gunthorpe <jgg@...dia.com>
CC:     Alex Williamson <alex.williamson@...hat.com>,
        "Jiang, Dave" <dave.jiang@...el.com>,
        "vkoul@...nel.org" <vkoul@...nel.org>,
        "Dey, Megha" <megha.dey@...el.com>,
        "maz@...nel.org" <maz@...nel.org>,
        "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
        "rafael@...nel.org" <rafael@...nel.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "hpa@...or.com" <hpa@...or.com>,
        "Pan, Jacob jun" <jacob.jun.pan@...el.com>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "Liu, Yi L" <yi.l.liu@...el.com>, "Lu, Baolu" <baolu.lu@...el.com>,
        "Kumar, Sanjay K" <sanjay.k.kumar@...el.com>,
        "Luck, Tony" <tony.luck@...el.com>,
        "Lin, Jing" <jing.lin@...el.com>,
        "Williams, Dan J" <dan.j.williams@...el.com>,
        "kwankhede@...dia.com" <kwankhede@...dia.com>,
        "eric.auger@...hat.com" <eric.auger@...hat.com>,
        "parav@...lanox.com" <parav@...lanox.com>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        "netanelg@...lanox.com" <netanelg@...lanox.com>,
        "shahafs@...lanox.com" <shahafs@...lanox.com>,
        "yan.y.zhao@...ux.intel.com" <yan.y.zhao@...ux.intel.com>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "Ortiz, Samuel" <samuel.ortiz@...el.com>,
        "Hossain, Mona" <mona.hossain@...el.com>,
        "dmaengine@...r.kernel.org" <dmaengine@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: RE: [PATCH RFC v2 00/18] Add VFIO mediated device support and DEV-MSI
 support for the idxd driver

> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Tuesday, August 18, 2020 7:50 PM
> 
> On Tue, Aug 18, 2020 at 01:09:01AM +0000, Tian, Kevin wrote:
> > The difference in my reply is not just about the implementation gap
> > of growing a userspace DMA framework to a passthrough framework.
> > My real point is about the different goals that each wants to achieve.
> > Userspace DMA is purely about allowing userspace to directly access
> > the portal and do DMA, but the wq configuration is always under kernel
> > driver's control. On the other hand, passthrough means delegating full
> > control of the wq to the guest and then associated mock-up (live migration,
> > vSVA, posted interrupt, etc.) for that to work. I really didn't see the
> > value of mixing them together when there is already a good candidate
> > to handle passthrough...
> 
> In Linux a 'VM' and virtualization has always been a normal system
> process that uses a few extra kernel features. This has been more or
> less the cornerstone of that design since the start.
> 
> In that view it doesn't make any sense to say that uAPI from idxd that
> is useful for virtualization somehow doesn't belong as part of the
> standard uAPI.

The point is that we already have a more standard uAPI (VFIO) which
is unified and vendor-agnostic to userspace. Creating a idxd specific
uAPI to absorb similar requirements that VFIO already does is not 
compelling and instead causes more trouble to Qemu or other VMMs 
as they need to deal with every such driver uAPI even when Qemu 
itself has no interest in the device detail (since the real user is inside 
guest). 

> 
> Especially when it is such a small detail like what APIs are used to
> configure the wq.
> 
> For instance, what about suspend/resume of containers using idxd?
> Wouldn't you want to have the same basic approach of controlling the
> wq from userspace that virtualization uses?
> 

I'm not familiar with how container suspend/resume is done today.
But my gut-feeling is that it's different from virtualization. For 
virtualization, the whole wq is assigned to the guest thus the uAPI
must provide a way to save the wq state including its configuration 
at suspsend, and then restore the state to what guest expects when
resume. However in container case which does userspace DMA, the
wq is managed by host kernel and could be shared between multiple
containers. So the wq state is irrelevant to container. The only relevant
state is the in-fly workloads which needs a draining interface. In this
view I think the two have a major difference.

Thanks
Kevin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ