Message-ID: <CACycT3vgaOrLVq+GDRK1PqqBRCkUAU0bYH=2CDvudsX0F9FBDA@mail.gmail.com>
Date: Mon, 11 Jul 2022 15:24:33 +0800
From: Yongji Xie <xieyongji@...edance.com>
To: Jason Wang <jasowang@...hat.com>
Cc: mst <mst@...hat.com>, Liu Xiaodong <xiaodong.liu@...el.com>,
Maxime Coquelin <maxime.coquelin@...hat.com>,
Stefan Hajnoczi <stefanha@...hat.com>,
songmuchun@...edance.com,
virtualization <virtualization@...ts.linux-foundation.org>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 0/5] VDUSE: Support registering userspace memory as
bounce buffer
On Mon, Jul 11, 2022 at 2:02 PM Jason Wang <jasowang@...hat.com> wrote:
>
> On Fri, Jul 8, 2022 at 5:53 PM Yongji Xie <xieyongji@...edance.com> wrote:
> >
> > On Fri, Jul 8, 2022 at 4:38 PM Jason Wang <jasowang@...hat.com> wrote:
> > >
> > > On Wed, Jul 6, 2022 at 6:16 PM Yongji Xie <xieyongji@...edance.com> wrote:
> > > >
> > > > On Wed, Jul 6, 2022 at 5:30 PM Jason Wang <jasowang@...hat.com> wrote:
> > > > >
> > > > > On Wed, Jul 6, 2022 at 1:05 PM Xie Yongji <xieyongji@...edance.com> wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > This series introduces three new ioctls, VDUSE_IOTLB_GET_INFO,
> > > > > > VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM, to support
> > > > > > registering and de-registering userspace memory as the IOTLB
> > > > > > bounce buffer in the virtio-vdpa case.
> > > > > >
> > > > > > The VDUSE_IOTLB_GET_INFO ioctl lets the user query IOTLB
> > > > > > information such as the bounce buffer size. The user can then
> > > > > > pass that information to the VDUSE_IOTLB_REG_UMEM and
> > > > > > VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
> > > > > > userspace memory for IOTLB.
> > > > > >
> > > > > > During registration and de-registration, any DMA data in use
> > > > > > is copied from the kernel bounce pages to the userspace bounce
> > > > > > pages and back.
> > > > > >
> > > > > > With this feature, existing applications such as SPDK
> > > > > > and DPDK can leverage the datapath of VDUSE directly and
> > > > > > efficiently as discussed before [1][2]. They can register
> > > > > > some preallocated hugepages to VDUSE to avoid an extra
> > > > > > memcpy from the bounce buffer to hugepages.
> > > > >
> > > > > This is really interesting.
> > > > >
> > > > > But a small concern on uAPI is that this seems to expose the VDUSE
> > > > > internal implementation (bounce buffer) to userspace. We tried hard to
> > > > > hide it via the GET_FD before. Anyway can we keep it?
> > > > >
> > > >
> > > > Another way is to change the GET_FD ioctl to add a flag, or reuse the
> > > > 'perm' field, to indicate whether an IOVA region supports userspace
> > > > memory registration. Then userspace can use
> > > > VDUSE_IOTLB_REG_UMEM/VDUSE_IOTLB_DEREG_UMEM to register/deregister
> > > > userspace memory for this IOVA region.
> > >
> > > Looks better.
> > >
> >
> > OK.
> >
> > > > Any suggestions?
> > >
> > > I wonder what the value is of keeping compatibility with the kernel
> > > mmapped bounce buffer. It means we need to take extra care with e.g. data
> > > copying when registering/deregistering userspace memory.
> > >
> >
> > I'm not sure I get your point on the compatibility with the kernel
> > bounce buffer. Do you mean they use the same iova region?
>
> Yes.
>
> >
> > The userspace daemon might crash or reboot. In those cases, we still
> > need a kernel buffer to store/recover the data.
>
> Yes, this should be a good point.
>
> >
> > > Can we simply allow a third kind of fd that only works for umem registration?
> > >
> >
> > Do you mean using another iova region for umem?
>
> I meant having a new kind of fd that only allows umem registration.
>
OK. Allowing registered userspace memory to be mapped via a new fd
seems a little complicated. For example, how should the mapping be
handled if the userspace daemon exits after the fd has already been
passed to another process?
> > I think we don't need
> > an fd in the umem case since the userspace daemon can access the memory
> > directly without using mmap() to map it into the address space in
> > advance.
>
> Ok, I will have a look at the code and get back.
>
OK. Looking forward to your reply.
Thanks,
Yongji