Date:   Wed, 18 May 2022 21:18:45 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Liu Xiaodong <xiaodong.liu@...el.com>
Cc:     linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        Harris James R <james.r.harris@...el.com>,
        io-uring@...r.kernel.org,
        Gabriel Krisman Bertazi <krisman@...labora.com>,
        ZiyangZhang <ZiyangZhang@...ux.alibaba.com>,
        Xiaoguang Wang <xiaoguang.wang@...ux.alibaba.com>,
        Stefan Hajnoczi <stefanha@...hat.com>,
        Jens Axboe <axboe@...nel.dk>, ming.lei@...hat.com
Subject: Re: [PATCH V2 0/1] ubd: add io_uring based userspace block driver

Hello Liu,

On Wed, May 18, 2022 at 02:38:08AM -0400, Liu Xiaodong wrote:
> On Tue, May 17, 2022 at 01:53:57PM +0800, Ming Lei wrote:
> > Hello Guys,
> > 
> > The ubd driver is a kernel driver implementing a generic userspace block
> > device/driver: it delivers io requests from the ubd block device
> > (/dev/ubdbN) to the ubd server[1], the userspace part of ubd, which
> > communicates with the ubd driver and handles the target-specific io logic
> > in its target module.
> > 
> > The other job of the ubd driver is to copy data between userspace buffers
> > and request/bio pages, or to use zero copy once mm is ready to support it.
> > The ubd driver doesn't handle any IO logic of the specific target, so it is
> > small and simple; all io logic is done by the target code in ubdserver.
> > 
> > These two are the main jobs done by the ubd driver.
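
To make the delivery side above concrete, here is a rough sketch of how
ubdsrv could post per-io commands on the ubd char device: each io slot posts
a FETCH command via IORING_OP_URING_CMD, and after handling a request posts
COMMIT_AND_FETCH to return the result and pull the next request. The opcode
names/values and the SQE128 payload note are assumptions drawn loosely from
the posted patchset, not a stable uapi, and IORING_OP_URING_CMD needs a
5.19-era kernel/liburing:

#include <liburing.h>
#include <string.h>

#define UBD_IO_FETCH_REQ		0x20	/* assumed from the patchset */
#define UBD_IO_COMMIT_AND_FETCH_REQ	0x21	/* assumed from the patchset */

static int queue_ubd_cmd(struct io_uring *ring, int cdev_fd,
                         unsigned int cmd_op, unsigned int tag)
{
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

        if (!sqe)
                return -EAGAIN;
        memset(sqe, 0, sizeof(*sqe));
        sqe->opcode = IORING_OP_URING_CMD;
        sqe->fd = cdev_fd;      /* dev->cdev_fd in ubdsrv */
        sqe->cmd_op = cmd_op;   /* FETCH a request, or COMMIT result + FETCH */
        sqe->user_data = tag;   /* map the completion back to an io slot */
        /* with IORING_SETUP_SQE128, the per-io payload (queue id, tag,
         * result, buffer address) would be written into sqe->cmd[] here */
        return 0;
}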
> 
> Hi, Lei
> 
> Your UBD implementation looks great. Its io_uring based design is interesting
> and brilliant.
> Toward the same goal of a userspace block device, VDUSE, initiated by Yongji
> last year, does similar work, but VDUSE sits under vdpa. VDUSE presents a
> virtio-blk device to other userspace processes such as containers, while the
> virtio-blk requests are also served by a userspace target.
> https://lists.linuxfoundation.org/pipermail/iommu/2021-June/056956.html 
> 
> I've been working on, and thinking about, serving RUNC containers efficiently
> with SPDK. But this work requires a proper new userspace block device module
> in the kernel. The high-level design idea for userspace block device
> implementations should be: use a ring for IO requests, so client and target
> can exchange reqs/resps quickly and in batches; and map a bounce buffer
> between kernel and userspace target, so the extra IO data copy that NBD
> incurs can be avoided. (Oh, yes, SPDK also needs a few more minor functions
> from this kernel module.)
> 
> UBD and VDUSE are both implemented in this way, while of course each of
> them has its own specific features and advantages.
> 
> Unlike UBD, which is straightforward and starts from scratch, VDUSE is
> embedded in the virtio framework. Its implementation is therefore more
> complicated, but all the virtio frontend utilities can be leveraged.
> When it comes to security/permission issues, it feels like UBD would have
> an easier time solving them.

Stefan Hajnoczi and I are discussing related security/permission issues;
can you share more details about your case?

> 
> So my questions are:
> 1. what do you think about the purpose overlap between UBD and VDUSE?

Sorry, I am not familiar with VDUSE. The motivation of ubd is simply to build
a high-performance generic userspace block driver. The ubd driver (kernel
part) is only responsible for communication and for copying data between
userspace buffers and kernel io request pages; the ubdsrv (userspace) target
handles the io logic.
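
As a rough sketch of that copy step (the function name is illustrative and
error handling is trimmed, not the patch's exact code; the iter direction is
from the user buffer's point of view):

#include <linux/blk-mq.h>
#include <linux/uio.h>

static int ubd_copy_io_data(struct request *rq, void __user *ubuf,
                            size_t len, bool to_user)
{
        struct req_iterator iter;
        struct bio_vec bv;
        struct iovec iov;
        struct iov_iter ii;
        int ret;

        ret = import_single_range(to_user ? READ : WRITE, ubuf, len, &iov, &ii);
        if (ret)
                return ret;

        rq_for_each_segment(bv, rq, iter) {
                size_t done;

                if (to_user)    /* ubd WRITE: request pages -> server buffer */
                        done = copy_page_to_iter(bv.bv_page, bv.bv_offset,
                                                 bv.bv_len, &ii);
                else            /* ubd READ: server buffer -> request pages */
                        done = copy_page_from_iter(bv.bv_page, bv.bv_offset,
                                                   bv.bv_len, &ii);
                if (done != bv.bv_len)
                        return -EFAULT;
        }
        return 0;
}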

> 2. Could UBD be implemented with SPDK-friendly functionality? (Mainly
> around io data mapping, since HW devices in SPDK need to access the mapped
> data buffer. Then, in function ubdsrv.c/ubdsrv_init_io_bufs(),
> "addr = mmap(,,,,dev->cdev_fd,)",

No, that code is actually for supporting zero copy.

But each request's buffer is allocated by ubdsrv and is definitely available
to any target; see loop_handle_io_async(), which handles IO from /dev/ubdbN,
for how to use the buffer. For READ, the target code needs to implement the
READ logic and fill the buffer with data, and the buffer is then copied to
the kernel io request pages; for WRITE, the target code needs to handle the
WRITE using the buffer, which has already been filled from the kernel io
request pages.
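
In the same spirit as loop_handle_io_async(), a rough sketch of what a
loop-style target handler boils down to; the struct and op encoding here are
illustrative assumptions, not ubdsrv's exact types:

#include <unistd.h>
#include <errno.h>
#include <sys/types.h>

enum { UBD_SKETCH_OP_READ, UBD_SKETCH_OP_WRITE };  /* assumed encoding */

struct ubd_io_sketch {
        int op;         /* READ or WRITE */
        void *buf;      /* per-request buffer allocated by ubdsrv */
        size_t len;     /* bytes to transfer */
        off_t off;      /* offset on the backing file */
};

static int loop_handle_io_sketch(int backing_fd, const struct ubd_io_sketch *io)
{
        ssize_t ret;

        if (io->op == UBD_SKETCH_OP_READ)
                /* READ: fill the buffer; the driver then copies it into
                 * the kernel io request pages */
                ret = pread(backing_fd, io->buf, io->len, io->off);
        else
                /* WRITE: the driver already filled the buffer from the
                 * kernel io request pages; consume it */
                ret = pwrite(backing_fd, io->buf, io->len, io->off);

        return ret < 0 ? -errno : (int)ret;
}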

> SPDK needs to know the PA of "addr".

What is PA, and why?

Userspace can only see the virtual memory (VM) of each buffer.

> Also, the memory pointed to by "addr" should be pinned all the time.)

The current implementation only pins pages while copying data between
userspace buffers and kernel io request pages. But I plan to support three
pinning behaviors (a rough sketch of the "always" case follows the list):

- never (current behavior: pin pages only while copying)
- lazy (keep pages pinned until the request has been idle for long enough)
- always (all pages in the userspace VM area are pinned for the device lifetime)
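
A rough sketch of what the "always" mode could look like on the kernel side,
purely illustrative and not code from the patch: pin the whole buffer region
up front with FOLL_LONGTERM and unpin it when the device goes away.

#include <linux/mm.h>
#include <linux/err.h>
#include <linux/slab.h>

static struct page **ubd_pin_region(unsigned long uaddr, unsigned long nr_pages)
{
        struct page **pages;
        int pinned;

        pages = kvmalloc_array(nr_pages, sizeof(*pages), GFP_KERNEL);
        if (!pages)
                return ERR_PTR(-ENOMEM);

        /* FOLL_LONGTERM: pages stay pinned for the device lifetime */
        pinned = pin_user_pages_fast(uaddr, nr_pages,
                                     FOLL_WRITE | FOLL_LONGTERM, pages);
        if (pinned != nr_pages) {
                if (pinned > 0)
                        unpin_user_pages(pages, pinned);
                kvfree(pages);
                return ERR_PTR(-EFAULT);
        }
        return pages;
}

static void ubd_unpin_region(struct page **pages, unsigned long nr_pages)
{
        unpin_user_pages(pages, nr_pages);
        kvfree(pages);
}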


Thanks, 
Ming
