Message-ID: <20151029015422.GT8773@dastard>
Date: Thu, 29 Oct 2015 12:54:22 +1100
From: Dave Chinner <david@...morbit.com>
To: Andres Freund <andres@...razel.de>
Cc: linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: Triggering non-integrity writeback from userspace
On Thu, Oct 29, 2015 at 12:23:12AM +0100, Andres Freund wrote:
> Hi,
>
> On 2015-10-29 07:48:34 +1100, Dave Chinner wrote:
> > > The idea of using SYNC_FILE_RANGE_WRITE beforehand is that
> > > the fsync() will only have to do very little work. The language in
> > > sync_file_range(2) doesn't inspire enough confidence for using it as an
> > > actual integrity operation :/
> >
> > So really you're trying to minimise the blocking/latency of fsync()?
>
> The blocking/latency of the fsync doesn't actually matter at all *for
> this callsite*. It's called from a dedicated background process - if
> it's slowed down by a couple seconds it doesn't matter much.
> The problem is that if you have a couple gigabytes of dirty data being
> fsync()ed at once, latency for concurrent reads and writes often goes
> absolutely apeshit. And those concurrent reads and writes might
> actually be latency sensitive.
Right, but my point is that with an async fsync/fdatasync you don't
need this background process - you can just trickle out async fdatasync
calls instead of trickling out calls to sync_file_range().
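
A rough userspace sketch of that idea, using the raw Linux native aio
syscalls so that no library is required. It assumes a kernel and
filesystem that actually implement IOCB_CMD_FDSYNC (which is what the
patch under discussion is about); where they don't, io_submit() is
expected to fail, typically with EINVAL. The function names, the queue
depth of 8, and the demo payload are made up for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Queue one async fdatasync on fd. Returns 0 if queued, else -errno. */
static int submit_fdatasync(aio_context_t ctx, struct iocb *cb, int fd)
{
    struct iocb *list[1] = { cb };

    memset(cb, 0, sizeof(*cb));
    cb->aio_fildes = fd;
    cb->aio_lio_opcode = IOCB_CMD_FDSYNC;
    return syscall(SYS_io_submit, ctx, 1L, list) == 1 ? 0 : -errno;
}

/*
 * Block until one completion arrives. Returns the result of the sync
 * (0 on success, negative errno), or -errno if the wait itself failed.
 */
static long reap_one(aio_context_t ctx)
{
    struct io_event ev;

    if (syscall(SYS_io_getevents, ctx, 1L, 1L, &ev, NULL) != 1)
        return -errno;
    return (long)ev.res;
}

/* Hypothetical demo: dirty a file, then flush it asynchronously. */
static int async_flush_demo(const char *path)
{
    static const char data[] = "some dirty data\n";
    aio_context_t ctx = 0;
    struct iocb cb;
    int fd, rc;

    fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -errno;
    if (syscall(SYS_io_setup, 8, &ctx) < 0) {
        rc = -errno;
        close(fd);
        return rc;
    }
    if (write(fd, data, sizeof(data) - 1) < 0)
        rc = -errno;
    else if ((rc = submit_fdatasync(ctx, &cb, fd)) == 0)
        rc = (int)reap_one(ctx); /* other work could be done before this */
    syscall(SYS_io_destroy, ctx);
    close(fd);
    return rc;
}
```

In a real application the submit and the reap would be separated, with
further writes and submissions in between; the demo waits immediately
only to keep the sketch short.
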
> By calling sync_file_range() over small ranges of pages shortly after
> they've been written we make it unlikely (but still possible) that much
> data has to be flushed at fsync() time.
Right, but you still need the fsync call, whereas with an async fsync
call you don't - when you gather the completion, no further action
needs to be taken on that dirty range.
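
For comparison, the sync_file_range() pattern described in the quoted
text looks roughly like the sketch below: each chunk is written, then
writeback is kicked for just that range, and a final fdatasync() is
still required for integrity. The helper names and the chunking scheme
are assumptions for illustration, and short writes are not retried.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write one chunk and ask the kernel to start writeback for it.
 * SYNC_FILE_RANGE_WRITE on its own does not wait for the I/O and is
 * not an integrity operation. Returns 0 on success, else -errno.
 */
static int write_and_kick(int fd, const void *buf, size_t len, off_t off)
{
    ssize_t n = pwrite(fd, buf, len, off);

    if (n < 0)
        return -errno;
    if (sync_file_range(fd, off, n, SYNC_FILE_RANGE_WRITE) < 0)
        return -errno;
    return 0;
}

/*
 * Trickle out nchunks chunks of the same buffer, then do the flush
 * that actually provides the integrity guarantee.
 */
static int trickle_then_sync(int fd, const char *buf, size_t chunk,
                             int nchunks)
{
    for (int i = 0; i < nchunks; i++) {
        int rc = write_and_kick(fd, buf, chunk, (off_t)i * chunk);

        if (rc < 0)
            return rc;
    }
    return fdatasync(fd) < 0 ? -errno : 0;
}
```

The final fdatasync() is the call that an async fsync completion would
make unnecessary.
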
> At the moment using fdatasync() instead of fsync() is a considerable
> performance advantage... If I understand the above proposal correctly,
> it'd allow specifying ranges, is that right?
Well, the patch I sent doesn't do ranges, but a range could easily be
passed in, as the iocb has offset/len parameters that are used by
IOCB_CMD_PREAD/PWRITE. io_prep_fsync/io_fsync both memset the iocb
to zero, so if we pass in a non-zero length, we could treat it as a
ranged f(d)sync quite easily.
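
To make that concrete, here is a sketch of how such an iocb might be
filled in. This is purely hypothetical: the "non-zero length means
ranged sync" semantics are only the proposal above, not an existing
interface, and a kernel without such an extension should be expected
to reject or ignore the range. prep_ranged_fdsync() is an invented
name; only the struct iocb fields and IOCB_CMD_FDSYNC come from
<linux/aio_abi.h>.

```c
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical: prepare an fdatasync iocb covering [offset, offset+len).
 * len == 0 leaves the iocb as a plain whole-file fdatasync, matching
 * what a zeroed iocb means today.
 */
static void prep_ranged_fdsync(struct iocb *cb, int fd,
                               uint64_t offset, uint64_t len)
{
    memset(cb, 0, sizeof(*cb));
    cb->aio_fildes = (uint32_t)fd;
    cb->aio_lio_opcode = IOCB_CMD_FDSYNC;
    if (len != 0) {
        cb->aio_offset = (int64_t)offset; /* same field PREAD/PWRITE use */
        cb->aio_nbytes = len;
    }
}
```
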
> There'll be some concern about portability around this - issuing
> sync_file_range() every now and then isn't particularly invasive. Using
> aio might end up being that, not sure.
It's still a non-portable, Linux-only solution, because it uses
the Linux native aio interface, not the glibc one...
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com