Message-ID: <20190319152212.GC24176@localhost.localdomain>
Date: Tue, 19 Mar 2019 09:22:13 -0600
From: Keith Busch <kbusch@...nel.org>
To: Maxim Levitsky <mlevitsk@...hat.com>
Cc: linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, Jens Axboe <axboe@...com>,
Alex Williamson <alex.williamson@...hat.com>,
Keith Busch <keith.busch@...el.com>,
Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>,
Kirti Wankhede <kwankhede@...dia.com>,
"David S . Miller" <davem@...emloft.net>,
Mauro Carvalho Chehab <mchehab+samsung@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Wolfram Sang <wsa@...-dreams.de>,
Nicolas Ferre <nicolas.ferre@...rochip.com>,
"Paul E . McKenney " <paulmck@...ux.ibm.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Liang Cunming <cunming.liang@...el.com>,
Liu Changpeng <changpeng.liu@...el.com>,
Fam Zheng <fam@...hon.net>, Amnon Ilan <ailan@...hat.com>,
John Ferlan <jferlan@...hat.com>
Subject: Re: your mail

On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> -> Share the NVMe device between host and guest.
> Even in fully virtualized configurations,
> some partitions of nvme device could be used by guests as block devices
> while others passed through with nvme-mdev to achieve balance between
> all features of full IO stack emulation and performance.
>
> -> NVME-MDEV is a bit faster due to the fact that in-kernel driver
> can send interrupts to the guest directly without a context
> switch that can be expensive due to meltdown mitigation.
>
> -> Is able to utilize interrupts to get reasonable performance.
> This is only implemented
> as a proof of concept and not included in the patches,
> but interrupt driven mode shows reasonable performance
>
> -> This is a framework that later can be used to support NVMe devices
> with more of the IO virtualization built-in
> (IOMMU with PASID support coupled with device that supports it)

Would be very interested to see the PASID support. You wouldn't even
need to mediate the IO doorbells or translations if assigning entire
namespaces, and should be much faster than the shadow doorbells.

I think you should send 6/9 "nvme/pci: init shadow doorbell after each
reset" separately for immediate inclusion.

I like the idea in principle, but it will take me a little time to get
through reviewing your implementation. I would have guessed we could
have leveraged something from the existing nvme/target for the mediating
controller register access and admin commands. Maybe even start with
implementing an nvme passthrough namespace target type (we currently
have block and file).
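
For context on the shadow doorbells mentioned above, here is a rough
sketch of the idea, not code from the patch series. With shadow
doorbells the driver writes its new queue index to a shared memory
buffer and only performs the real MMIO doorbell write when the event
index published by the other side falls inside the range of entries
just submitted. The helper name and parameter names below are
illustrative assumptions; the comparison is the usual wrap-safe
16-bit event-index check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether an MMIO doorbell write is still
 * needed after the shadow doorbell was updated from old_idx to new_idx.
 *
 * Returns true when event_idx lies in [old_idx, new_idx), with all
 * arithmetic done modulo 2^16 so that index wraparound is handled.
 */
static bool dbbuf_need_event(uint16_t event_idx, uint16_t new_idx,
			     uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) <
	       (uint16_t)(new_idx - old_idx);
}
```

When this returns false the MMIO write can be skipped, which is where
the saving over a trapped doorbell write comes from. It still needs the
shared buffers to be mediated, unlike the PASID approach.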