lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y9JnBDrm0V1ZdWK6@T590>
Date:   Thu, 26 Jan 2023 19:41:56 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Willy Tarreau <w@....eu>
Cc:     Jens Axboe <axboe@...nel.dk>, io-uring@...r.kernel.org,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        nbd@...er.debian.org, ming.lei@...hat.com
Subject: Re: ublk-nbd: ublk-nbd is avaialbe

On Thu, Jan 26, 2023 at 05:08:22AM +0100, Willy Tarreau wrote:
> Hi,
> 
> On Thu, Jan 26, 2023 at 11:08:26AM +0800, Ming Lei wrote:
> > Hi Jens,
> > 
> > On Thu, Jan 19, 2023 at 11:49:04AM -0700, Jens Axboe wrote:
> > > On 1/19/23 7:23 AM, Ming Lei wrote:
> > > > Hi,
> > > > 
> > > > ublk-nbd[1] is available now.
> > > > 
> > > > Basically it is one nbd client, but totally implemented in userspace,
> > > > and wrt. current nbd-client in [2], the transmission phase is done
> > > > by linux block nbd driver.
> > > > 
> > > > The handshake implementation is borrowed from nbd project[2], so
> > > > basically ublk-nbd just adds new code for implementing transmission
> > > > phase, and it can be thought as moving linux block nbd driver into
> > > > userspace.
> > > > 
> > > > The added new code is basically in nbd/tgt_nbd.cpp, and io handling
> > > > is based on liburing[3], and implemented by c++20 coroutine, so
> > > > everything is done in single pthread totally lockless, meantime turns
> > > > out it is pretty easy to design & implement, attributed to ublk framework,
> > > > c++20 coroutine and liburing.
> > > > 
> > > > ublk-nbd supports both tcp and unix socket, and allows to enable io_uring
> > > > send zero copy via command line '--send_zc', see details in README[4].
> > > > 
> > > > No regression is found in xfstests by using ublk-nbd as both test device
> > > > and scratch device, and builtin test(make test T=nbd) runs well.
> > > > 
> > > > Fio test("make test T=nbd") shows that ublk-nbd performance is
> > > > basically same with nbd-client/nbd driver when running fio on real
> > > > ethernet link(1g, 10+g), but ublk-nbd IOPS is higher by ~40% than
> > > > nbd-client(nbd driver) with 512K BS, which is because linux nbd
> > > > driver sets max_sectors_kb as 64KB at default.
> > > > 
> > > > But when running fio over local tcp socket, it is observed in my test
> > > > machine that ublk-nbd performs better than nbd-client/nbd driver,
> > > > especially with 2 queue/2 jobs, and the gap could be 10% ~ 30%
> > > > according to different block size.
> > > 
> > > This is pretty nice! Just curious, have you tried setting up your
> > > ring with
> > > 
> > > p.flags |= IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;
> > > 
> > > and see if that yields any extra performance improvements for you?
> > > Depending on how you do processing, you should not need to do any
> > > further changes there.
> > > 
> > > A "lighter" version is just setting IORING_SETUP_COOP_TASKRUN.
> > 
> > IORING_SETUP_COOP_TASKRUN is enabled in current ublksrv.
> > 
> > After disabling COOP_TASKRUN and enabling SINGLE_ISSUER & DEFER_TASKRUN,
> > not see obvious improvement, meantime regression is observed on 64k
> > rw.
> 
> Does it handle network errors better than the default nbd client, i.e.
> is it able to seamlessly reconnect after while keeping the same device
> or do you end up with multiple devices ? That's one big trouble I faced
> with the original nbd client, forcing you to unmount and remount
> everything after a network outage for example.

All kinds of ublk disk supports such seamlessly recovery which is
provided by UBLK_CMD_START_USER_RECOVERY/UBLK_CMD_END_USER_RECOVERY.
During user recovery, the bdev and gendisk instance won't be gone,
and will become fully functional after the recovery(such as reconnect)
is successful.

So yes for this seamlessly reconnect error handling.


Thanks, 
Ming

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ