Message-ID: <BY5PR21MB1506BE2D80B696923486D191CEE29@BY5PR21MB1506.namprd21.prod.outlook.com>
Date: Tue, 20 Jul 2021 18:52:49 +0000
From: Long Li <longli@...rosoft.com>
To: "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
CC: Bart Van Assche <bvanassche@....org>,
Christoph Hellwig <hch@...radead.org>,
"longli@...uxonhyperv.com" <longli@...uxonhyperv.com>,
"linux-fs@...r.kernel.org" <linux-fs@...r.kernel.org>,
"linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>
Subject: RE: [Patch v4 0/3] Introduce a driver to support host accelerated
access to Microsoft Azure Blob
> Subject: Re: [Patch v4 0/3] Introduce a driver to support host accelerated
> access to Microsoft Azure Blob
>
> On Tue, Jul 20, 2021 at 05:33:47PM +0000, Long Li wrote:
> > > Subject: Re: [Patch v4 0/3] Introduce a driver to support host
> > > accelerated access to Microsoft Azure Blob
> > >
> > > On 7/20/21 12:05 AM, Long Li wrote:
> > > >> Subject: Re: [Patch v4 0/3] Introduce a driver to support host
> > > >> accelerated access to Microsoft Azure Blob
> > > >>
> > > >> On Mon, Jul 19, 2021 at 09:37:56PM -0700, Bart Van Assche wrote:
> > > >>> such that this object storage driver can be implemented as a
> > > >>> user-space library instead of as a kernel driver? As you may
> > > >>> know vfio users can either use eventfds for completion notifications
> > > >>> or polling.
> > > >>> An interface like io_uring can be built easily on top of vfio.
> > > >>
> > > >> Yes. Similar to say the NVMe K/V command set this does not look
> > > >> like a candidate for a kernel driver.
> > > >
> > > > The driver is modeled to support multiple processes/users over a
> > > > VMBUS channel. I don't see a way that this can be implemented
> > > > through VFIO?
> > > >
> > > > Even if it can be done, this exposes a security risk as the same
> > > > VMBUS channel is shared by multiple processes in user-mode.
> > >
> > > Sharing a VMBUS channel among processes is not necessary. I propose
> > > to assign one VMBUS channel to each process and to multiplex I/O
> > > submitted to channels associated with the same blob storage object
> > > inside e.g. the hypervisor. This is not a new idea. In the NVMe
> > > specification there is a diagram that shows that multiple NVMe
> > > controllers can provide access to the same NVMe namespace. See also
> > > diagram "Figure 416: NVM Subsystem with Three I/O Controllers" in
> > > version 1.4 of the NVMe specification.
> > >
> > > Bart.
> >
> > Currently, the Hyper-V is not designed to have one VMBUS channel for
> > each process.
>
> So it's a slow interface :(
>
> > In Hyper-V, a channel is offered from the host to the guest VM. The
> > host doesn't know in advance how many processes are going to use this
> > service so it can't offer those channels in advance. There is no
> > mechanism to offer dynamic per-process allocated channels based on
> > guest needs. Some devices (e.g. network and storage) use multiple
> > channels for scalability but they are not for serving individual
> > processes.
> >
> > Assigning one VMBUS channel per process needs significant change on the
> > Hyper-V side.
>
> What is the throughput of a single channel as-is? You provided no
> benchmarks or numbers at all in this patchset which would justify this new
> kernel driver :(

Test data shows that a single channel is not a limitation for the target
workload. The VSP/VSC protocol is designed to avoid data copies as much as
possible. Because this is a VMBUS device, Hyper-V can allocate multiple
channels on different CPUs if a single channel proves to be a bottleneck.

Preliminary test results show a performance increase of up to 30% in a data
center environment, while at the same time reducing the number of servers
and CPUs serving Blob requests, compared to going through the complete HTTP
stack. This also enables direct use of transport technologies to the
backend servers (for example, RDMA) that are not available to VMs for
security reasons.

Long

>
> thanks,
>
> greg k-h