Message-ID: <20230126125433.GA4368@1wt.eu>
Date: Thu, 26 Jan 2023 13:54:33 +0100
From: Willy Tarreau <w@....eu>
To: Ming Lei <ming.lei@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>, io-uring@...r.kernel.org,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
nbd@...er.debian.org
Subject: Re: ublk-nbd: ublk-nbd is available

On Thu, Jan 26, 2023 at 07:41:56PM +0800, Ming Lei wrote:
> On Thu, Jan 26, 2023 at 05:08:22AM +0100, Willy Tarreau wrote:
> > Hi,
> >
> > On Thu, Jan 26, 2023 at 11:08:26AM +0800, Ming Lei wrote:
> > > Hi Jens,
> > >
> > > On Thu, Jan 19, 2023 at 11:49:04AM -0700, Jens Axboe wrote:
> > > > On 1/19/23 7:23 AM, Ming Lei wrote:
> > > > > Hi,
> > > > >
> > > > > ublk-nbd[1] is available now.
> > > > >
> > > > > It is basically an nbd client implemented entirely in userspace,
> > > > > whereas with the current nbd-client in [2] the transmission phase
> > > > > is done by the Linux block nbd driver.
> > > > >
> > > > > The handshake implementation is borrowed from the nbd project[2], so
> > > > > ublk-nbd only adds new code for the transmission phase; it can be
> > > > > thought of as moving the Linux block nbd driver into userspace.
> > > > >
> > > > > The new code is mostly in nbd/tgt_nbd.cpp. IO handling is based on
> > > > > liburing[3] and implemented with C++20 coroutines, so everything runs
> > > > > in a single pthread and is completely lockless. It also turned out to
> > > > > be pretty easy to design and implement, thanks to the ublk framework,
> > > > > C++20 coroutines and liburing.
> > > > >
> > > > > ublk-nbd supports both tcp and unix sockets, and io_uring send zero
> > > > > copy can be enabled with the '--send_zc' command-line option; see the
> > > > > README[4] for details.
> > > > >
> > > > > No regression was found in xfstests when using ublk-nbd as both the
> > > > > test device and the scratch device, and the builtin test
> > > > > (make test T=nbd) runs well.
> > > > >
> > > > > The fio test ("make test T=nbd") shows that ublk-nbd performance is
> > > > > basically the same as nbd-client/nbd driver when running fio over a
> > > > > real ethernet link (1g, 10+g). With a 512K block size, however,
> > > > > ublk-nbd IOPS is ~40% higher than nbd-client (nbd driver), because the
> > > > > Linux nbd driver sets max_sectors_kb to 64KB by default.
> > > > >
> > > > > When running fio over a local tcp socket, however, ublk-nbd performs
> > > > > better than nbd-client/nbd driver on my test machine, especially
> > > > > with 2 queues/2 jobs; the gap can be 10% to 30%, depending on the
> > > > > block size.
> > > >
> > > > This is pretty nice! Just curious, have you tried setting up your
> > > > ring with
> > > >
> > > > p.flags |= IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;
> > > >
> > > > and see if that yields any extra performance improvements for you?
> > > > Depending on how you do processing, you should not need to do any
> > > > further changes there.
> > > >
> > > > A "lighter" version is just setting IORING_SETUP_COOP_TASKRUN.
> > >
> > > IORING_SETUP_COOP_TASKRUN is already enabled in the current ublksrv.
> > >
> > > After disabling COOP_TASKRUN and enabling SINGLE_ISSUER & DEFER_TASKRUN,
> > > I do not see any obvious improvement, and a regression is observed on
> > > 64k rw.
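
For readers following the flag discussion above, here is a minimal sketch in C of the three ring setups being compared. It only fills in struct io_uring_params and does not create a ring. The ring_params_init() helper and the taskrun_mode enum are hypothetical names invented for this illustration; they are not part of liburing or ublksrv. The #ifndef fallbacks are an assumption that the installed kernel headers may predate these flags.

```c
#include <string.h>
#include <linux/io_uring.h>

/* Fallbacks for older kernel headers; values match the upstream UAPI header. */
#ifndef IORING_SETUP_COOP_TASKRUN
#define IORING_SETUP_COOP_TASKRUN   (1U << 8)
#endif
#ifndef IORING_SETUP_SINGLE_ISSUER
#define IORING_SETUP_SINGLE_ISSUER  (1U << 12)
#endif
#ifndef IORING_SETUP_DEFER_TASKRUN
#define IORING_SETUP_DEFER_TASKRUN  (1U << 13)
#endif

/* Hypothetical selector for this sketch, not a liburing or ublksrv type. */
enum taskrun_mode {
    TASKRUN_DEFAULT,  /* no task-run related flags */
    TASKRUN_COOP,     /* the "lighter" variant */
    TASKRUN_DEFER     /* SINGLE_ISSUER + DEFER_TASKRUN */
};

/* Zero the params and set only the flags for the chosen mode. */
void ring_params_init(struct io_uring_params *p, enum taskrun_mode mode)
{
    memset(p, 0, sizeof(*p));

    switch (mode) {
    case TASKRUN_COOP:
        p->flags |= IORING_SETUP_COOP_TASKRUN;
        break;
    case TASKRUN_DEFER:
        /* The kernel accepts DEFER_TASKRUN only together with SINGLE_ISSUER. */
        p->flags |= IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;
        break;
    case TASKRUN_DEFAULT:
        break;
    }
}
```

With liburing, the filled-in params would typically be passed to io_uring_queue_init_params(). With DEFER_TASKRUN, completion work runs only when the submitting task itself enters the kernel to reap events, which is presumably what "depending on how you do processing" refers to.
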
> >
> > Does it handle network errors better than the default nbd client, i.e.
> > is it able to seamlessly reconnect afterwards while keeping the same
> > device, or do you end up with multiple devices? That's one big problem I
> > faced with the original nbd client: it forces you to unmount and remount
> > everything after a network outage, for example.
>
> All kinds of ublk disks support this kind of seamless recovery, which is
> provided by UBLK_CMD_START_USER_RECOVERY/UBLK_CMD_END_USER_RECOVERY.
> During user recovery, the bdev and gendisk instances are kept, and the
> device becomes fully functional again once the recovery (such as a
> reconnect) succeeds.
>
> So yes, seamless reconnection after errors is handled.

Nice, it's tempting to give it a try then ;-)

Willy
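
The recovery flow described in the quoted reply above can be pictured with a toy state model in C. This is only a sketch under stated assumptions: it does not talk to /dev/ublk-control, every toy_* name is hypothetical, and the three states are a simplification of the dead/live/quiesced device states in the kernel's ublk documentation. In the real driver the new server must also re-issue its fetch commands between the two control commands.

```c
#include <stdbool.h>

/* Simplified device states for this toy model. */
enum dev_state { DEV_DEAD, DEV_LIVE, DEV_QUIESCED };

struct toy_ublk_dev {
    enum dev_state state;
    bool user_recovery;  /* stands in for the UBLK_F_USER_RECOVERY feature */
    int  dev_id;         /* unchanged across recovery: same block device */
    int  daemon_pid;     /* -1 while no server process is attached */
};

/* The server process went away, e.g. after a network outage. */
void toy_daemon_exit(struct toy_ublk_dev *d)
{
    if (d->state != DEV_LIVE)
        return;
    /* With recovery enabled the disk is kept and only quiesced. */
    d->state = d->user_recovery ? DEV_QUIESCED : DEV_DEAD;
    d->daemon_pid = -1;
}

/* Models UBLK_CMD_START_USER_RECOVERY: valid only on a quiesced device. */
int toy_start_user_recovery(const struct toy_ublk_dev *d)
{
    return d->state == DEV_QUIESCED ? 0 : -1;
}

/* Models UBLK_CMD_END_USER_RECOVERY: the new server takes over. */
int toy_end_user_recovery(struct toy_ublk_dev *d, int new_pid)
{
    if (d->state != DEV_QUIESCED)
        return -1;
    d->daemon_pid = new_pid;
    d->state = DEV_LIVE;
    return 0;
}
```

The point of the model is that dev_id never changes between toy_daemon_exit() and toy_end_user_recovery(), which corresponds to the bdev and gendisk staying in place so that mounted filesystems do not have to be remounted.
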