lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <488768D7-1396-4DD1-A648-C86E5CF7DB2F@nutanix.com>
Date:   Wed, 20 Mar 2019 11:03:29 +0000
From:   Felipe Franciosi <felipe@...anix.com>
To:     Maxim Levitsky <mlevitsk@...hat.com>
CC:     "linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        Jens Axboe <axboe@...com>,
        Alex Williamson <alex.williamson@...hat.com>,
        Keith Busch <keith.busch@...el.com>,
        Christoph Hellwig <hch@....de>,
        Sagi Grimberg <sagi@...mberg.me>,
        Kirti Wankhede <kwankhede@...dia.com>,
        "David S . Miller" <davem@...emloft.net>,
        Mauro Carvalho Chehab <mchehab+samsung@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Wolfram Sang <wsa@...-dreams.de>,
        Nicolas Ferre <nicolas.ferre@...rochip.com>,
        "Paul E . McKenney" <paulmck@...ux.ibm.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Liang Cunming <cunming.liang@...el.com>,
        Liu Changpeng <changpeng.liu@...el.com>,
        Fam Zheng <fam@...hon.net>, Amnon Ilan <ailan@...hat.com>,
        John Ferlan <jferlan@...hat.com>,
        Stefan Hajnoczi <stefanha@...hat.com>,
        "Harris, James R" <james.r.harris@...el.com>,
        Thanos Makatos <thanos.makatos@...anix.com>
Subject: Re: 


> On Mar 19, 2019, at 2:41 PM, Maxim Levitsky <mlevitsk@...hat.com> wrote:
> 
> Date: Tue, 19 Mar 2019 14:45:45 +0200
> Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
> 
> Hi everyone!
> 
> In this patch series, I would like to introduce my take on the problem of doing 
> as fast as possible virtualization of storage with emphasis on low latency.
> 
> In this patch series I implemented a kernel vfio based, mediated device that 
> allows the user to pass through a partition and/or whole namespace to a guest.

Hey Maxim!

I'm really excited to see this series, as it aligns to some extent with what we discussed in last year's KVM Forum VFIO BoF.

There's no arguing that we need a better story to efficiently virtualise NVMe devices. So far, for Qemu-based VMs, Changpeng's vhost-user-nvme is the best attempt at that. However, I seem to recall there was some pushback from qemu-devel in the sense that they would rather see investment in virtio-blk. I'm not sure what's the latest on that work and what are the next steps.

The pushback drove the discussion towards pursuing an mdev approach, which is why I'm excited to see your patches.

What I'm thinking is that passing through namespaces or partitions is very restrictive. It leaves no room to implement more elaborate virtualisation stacks like replicating data across multiple devices (local or remote), storage migration, software-managed thin provisioning, encryption, deduplication, compression, etc. In summary, anything that requires software intervention in the datapath. (Worth noting: vhost-user-nvme allows all of that to be easily done in SPDK's bdev layer.)

These complicated stacks should probably not be implemented in the kernel, though. So I'm wondering whether we could talk about mechanisms to allow efficient and performant userspace datapath intervention  in your approach or pursue a mechanism to completely offload the device emulation to userspace (and align with what SPDK has to offer).

Thoughts welcome!
Felipe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ