Message-ID: <alpine.DEB.2.02.1210241447210.8519@asgard.lang.hm>
Date: Wed, 24 Oct 2012 15:03:00 -0700 (PDT)
From: david@...g.hm
To: Nico Williams <nico@...ptonector.com>
cc: General Discussion of SQLite Database <sqlite-users@...ite.org>,
杨苏立 Yang Su Li <suli@...wisc.edu>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
drh@...ci.com
Subject: Re: [sqlite] light weight write barriers
On Wed, 24 Oct 2012, Nico Williams wrote:
>> Before that happens, people will keep returning again and again with those
>> simple questions: why the queue must be flushed for any ordered operation?
>> Isn't it an obvious overkill?
>
> That [cache flushing] is not what's being asked for here. Just a
> light-weight barrier. My proposal works without having to add new
> system calls: a) use a COW format, b) have background threads doing
> fsync()s, c) in each transaction's root block note the last
> known-committed (from a completed fsync()) transaction's root block,
> d) have an array of well-known uberblocks large enough to accommodate
> as many transactions as possible without having to wait for any one
> fsync() to complete, e) do not reclaim space from any one past
> transaction until at least one subsequent transaction is fully
> committed. This obtains ACI- transaction semantics (survives power
> failures but without durability for the last N transactions at
> power-failure time) without requiring changes to the OS at all, and
> with support for delayed D (durability) notification.
I'm doing some work with rsyslog and its disk-based queues, and there is a
similar issue there. The good news is that we can have a version that is
Linux-specific: rsyslog is used on other OSs, but they can keep using the
existing queue implementation, so if a significantly faster one is
Linux-only, that's just a win for Linux.
Like what is being described for sqlite, losing the tail end of the
messages is not a big problem under normal conditions. But there is a need
to be sure that what is there is complete up to the point where it's lost.
This is similar in concept to the write-ahead logs used by databases (without
the absolute durability requirement).
1. new messages arrive and get added to the end of the queue file.
2. a thread updates the queue to indicate that it is in the process
of delivering a block of messages
3. the thread updates the queue to indicate that the block of messages has
been delivered
4. garbage collection happens to delete the old messages to free up space
(if queues go into files, this can just be to limit the file size,
spilling to multiple files, and when an old file is completely marked as
delivered, delete it)
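
To make the four steps above concrete, here is a rough sketch in Python.
It is not how rsyslog actually stores its queues; the record layout
(length + CRC32 header), the separate ".state" file, and every function
name are assumptions made up for illustration. The CRC is what lets a
reader stop at the first torn record, so whatever it returns is complete
up to the point where data was lost.

```python
import os
import struct
import zlib

HEADER = struct.Struct("<II")  # hypothetical layout: payload length, CRC32
MARK = struct.Struct("<cQ")    # hypothetical mark: kind (b"D" or b"A"), offset


def frame(payload):
    """Wrap a payload so a reader can later tell whether it is complete."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def append_record(path, payload, sync=False):
    """Step 1: add a record to the end of a file (optionally fsync it)."""
    with open(path, "ab") as f:
        f.write(frame(payload))
        if sync:
            f.flush()
            os.fsync(f.fileno())


def read_valid_prefix(path):
    """Return (records, valid_bytes), stopping at the first bad record."""
    with open(path, "rb") as f:
        data = f.read()
    records, pos = [], 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        end = pos + HEADER.size + length
        # length == 0 is treated as end-of-data: a zero-filled tail would
        # otherwise look like valid empty records (CRC32 of b"" is 0).
        if length == 0 or end > len(data):
            break
        payload = data[pos + HEADER.size:end]
        if zlib.crc32(payload) != crc:
            break
        records.append(payload)
        pos = end
    return records, pos


def mark(state_path, kind, offset):
    """Steps 2 and 3: note b"D" (delivering) or b"A" (delivered) up to offset."""
    append_record(state_path, MARK.pack(kind, offset))


def delivered_offset(state_path):
    """Highest queue offset that has been marked as delivered."""
    if not os.path.exists(state_path):
        return 0
    done = 0
    for rec in read_valid_prefix(state_path)[0]:
        kind, offset = MARK.unpack(rec)
        if kind == b"A":
            done = max(done, offset)
    return done


def collect(queue_path, state_path):
    """Step 4: delete a queue file once all of its valid data is delivered."""
    _, valid = read_valid_prefix(queue_path)
    if valid > 0 and delivered_offset(state_path) >= valid:
        os.remove(queue_path)
        if os.path.exists(state_path):
            os.remove(state_path)
        return True
    return False
```

Note that this only detects a torn or reordered tail after the fact; it
does nothing to order the writes on disk, which is the part a light
weight barrier would help with.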
I do not fully understand how what you are describing (COW, separate
fsync threads, etc.) would be implemented on top of existing filesystems.
Most of what you are describing seems to require access to the
underlying storage to implement.
Could you give a more detailed explanation?
David Lang