Message-ID: <CAGF4SLjKD_=ra4A6CQFv5bsETNoEyVKjA7FRBek7DwuWjyNbCA@mail.gmail.com>
Date:   Tue, 6 Nov 2018 15:08:26 -0500
From:   Vitaly Mayatskih <v.mayatskih@...il.com>
To:     stefanha@...il.com
Cc:     Jason Wang <jasowang@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
        virtualization@...ts.linux-foundation.org,
        linux-kernel@...r.kernel.org, Kevin Wolf <kwolf@...hat.com>,
        "Michael S . Tsirkin" <mst@...hat.com>, den@...tuozzo.com
Subject: Re: [PATCH 0/1] vhost: add vhost_blk driver

On Tue, Nov 6, 2018 at 10:40 AM Stefan Hajnoczi <stefanha@...il.com> wrote:

> Previously vhost_blk.ko implementations were basically the same thing as
> the QEMU x-data-plane=on (dedicated thread using Linux AIO), except they
> were using a kernel thread and maybe submitted bios.
>
> The performance differences weren't convincing enough that it seemed
> worthwhile maintaining another code path which loses live migration, I/O
> throttling, image file formats, etc (all the things that QEMU's block
> layer supports).
>
> Two changes since then:
>
> 1. x-data-plane=on has been replaced with a full trip down QEMU's block
> layer (-object iothread,id=iothread0 -device
> virtio-blk-pci,iothread=iothread0,...).  It's slower and not truly
> multiqueue (yet!).
>
> So from this perspective vhost_blk.ko might be more attractive again, at
> least until further QEMU block layer work eliminates the multiqueue and
> performance overheads.

Yes, this work is a direct consequence of the insufficient performance
of virtio-blk's host side. I'm working on a storage driver, but there's
no good way to feed all these IOs into one disk of one VM. The nature
of the storage design dictates the need for very high IOPS as seen by
the VM. This is only one tiny use case, of course, but the vhost/QEMU
change is small enough to share.

> 2. SPDK has become available for users who want the best I/O performance
> and are willing to sacrifice CPU cores for polling.
>
> If you want better performance and don't care about QEMU block layer
> features, could you use SPDK?  People who are the target market for
> vhost_blk.ko would probably be willing to use SPDK and it already
> exists...

Yes. Though in my experience SPDK creates more problems than it solves
most of the time ;) What I find very compelling about using a plain
Linux block device is that it is really fast these days (blk-mq) and
the device mapper can be used for even greater flexibility. Device
mapper is less than perfect performance-wise and will need some work
at some point for sure, but it can still push a few million IOPS
through. And it's all standard code with decades-old user APIs.

In fact, the Linux kernel is so good now that our pure-software
solution can push IO at rates up to the limits of fat hardware (x00 GbE,
a bunch of NVMes) without an apparent need for hardware acceleration.
And, without hardware dependencies, it is much more flexible. The disk
interface between the host and the VM was the only major bottleneck.

> From the QEMU userspace perspective, I think the best way to integrate
> vhost_blk.ko is to transparently switch to it when possible.  If the
> user enables QEMU block layer features that are incompatible with
> vhost_blk.ko, then it should fall back to the QEMU block layer
> transparently.

Sounds like an excellent idea! I'll do that. Most of the vhost-blk
support in QEMU is boilerplate code anyway.
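
A minimal sketch of what such a transparent selection could look like,
written in Python only for brevity. Every name in it is hypothetical and
none of it is QEMU's actual API; the set of incompatible features is
simply the one listed earlier in this thread (live migration, I/O
throttling, image file formats), and a real list would likely be longer.

```python
from enum import Enum, auto


class Feature(Enum):
    """QEMU block layer features a user may enable (illustrative names)."""
    LIVE_MIGRATION = auto()
    IO_THROTTLING = auto()
    IMAGE_FORMAT = auto()  # any image file format handled by the block layer


# Assumption: these are the features lost with vhost_blk.ko, as named in
# the thread. The real set is not specified there ("etc").
VHOST_BLK_INCOMPATIBLE = frozenset({
    Feature.LIVE_MIGRATION,
    Feature.IO_THROTTLING,
    Feature.IMAGE_FORMAT,
})


def choose_backend(requested, vhost_blk_available):
    """Pick the accelerated path only when nothing requested rules it out.

    requested: iterable of Feature values the user enabled.
    vhost_blk_available: whether the host offers vhost_blk at all.
    Returns a hypothetical backend label.
    """
    blockers = set(requested) & VHOST_BLK_INCOMPATIBLE
    if vhost_blk_available and not blockers:
        return "vhost-blk"
    # Fall back without user involvement.
    return "qemu-block-layer"
```

With this shape the user never chooses a backend explicitly: enabling,
say, I/O throttling just makes choose_backend() return the QEMU block
layer path.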

> I'm not keen on yet another code path with its own set of limitations
> and having to educate users about how to make the choice.  But if it can
> be integrated transparently as an "accelerator", then it could be
> valuable.

Understood. Agreed.

Thanks!
-- 
wbr, Vitaly
