Message-ID: <20080514150052.GA15826@2ka.mipt.ru>
Date: Wed, 14 May 2008 19:00:53 +0400
From: Evgeniy Polyakov <johnpol@....mipt.ru>
To: Jamie Lokier <jamie@...reable.org>
Cc: Sage Weil <sage@...dream.net>, Jeff Garzik <jeff@...zik.org>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
linux-fsdevel@...r.kernel.org
Subject: Re: POHMELFS high performance network filesystem. Transactions, failover, performance.
On Wed, May 14, 2008 at 03:31:05PM +0100, Jamie Lokier (jamie@...reable.org) wrote:
> > If we are talking about aggregate parallel performance, then its basic
> > protocol with 2 messages is (probably) optimal, but I'm still not
> > convinced that the 2-message case is a good choice; I want one :)
>
> Look up "one-phase commit" or even "zero-phase commit". (The
> terminology is cheating a bit.) As I've understood it, all commit
> protocols have a step where each node guarantees it can commit if
> asked and node failure at that point does not invalidate the guarantee
> if the node recovers (if it can't maintain the guarantee, the node
> doesn't recover in a technical sense and a higher level protocol may
> reintegrate the node). One/zero-phase commit extends that to
> guaranteeing that certain amounts and types of data can be written before
> it knows what the data is, so write messages within that window are
> sufficient for global commits. Guarantees can be acquired
> asynchronously in advance of need, and can have time and other limits.
> These guarantees are no different in principle from the 1-bit
> guarantee offered by the "can you commit" phase of other commit
> protocols, so they aren't as weak as they seem.
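A rough sketch of the quoted idea, assuming a node hands out a reservation (a byte budget plus an expiry time) ahead of need, and any later write that fits inside that reservation is committed by that single message. The names `Reservation`, `Node`, `reserve` and `write` are hypothetical and exist only for this illustration; durability across node failure is not modelled.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Reservation:
    token: int         # handle the client presents with each write
    max_bytes: int     # remaining byte budget of the guarantee
    expires_at: float  # time limit of the guarantee


@dataclass
class Node:
    committed: list = field(default_factory=list)
    _grants: dict = field(default_factory=dict)
    _next_token: int = 0

    def reserve(self, max_bytes, ttl, now=None):
        """Grant a guarantee in advance, before the data is known."""
        now = time.monotonic() if now is None else now
        self._next_token += 1
        grant = Reservation(self._next_token, max_bytes, now + ttl)
        self._grants[grant.token] = grant
        return grant

    def write(self, token, data, now=None):
        """Single message: commits if it fits the reserved window."""
        now = time.monotonic() if now is None else now
        grant = self._grants.get(token)
        if grant is None or now > grant.expires_at:
            return False  # no valid guarantee, nothing is committed
        if len(data) > grant.max_bytes:
            return False  # write exceeds what was promised
        grant.max_bytes -= len(data)
        self.committed.append(data)
        return True
```

The reservation plays the role of the "can you commit" answer, only obtained asynchronously and for a bounded amount of data, so the write itself is the one remaining message.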
If I understood that correctly, the client has to connect to all servers
and send the data to each of them, so that things are committed after a
single reply. That is definitely not an option when there are lots of
servers. It can work if the client connects to some gate server, which in
turn broadcasts the data further; that is how I plan to implement things
at first.
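A minimal sketch of the gate server arrangement, under the assumption that the gate forwards each write to every backend server and answers the client once with the combined result. `Server`, `Gate`, `store` and `write` are hypothetical names used only here; this is not POHMELFS code, and networking, failures and retries are left out.

```python
class Server:
    """Backend storage server (in-memory stand-in)."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def store(self, data):
        self.log.append(data)
        return True  # acknowledgement back to the gate


class Gate:
    """The client talks only to the gate; the gate fans the data out."""

    def __init__(self, servers):
        self.servers = list(servers)

    def write(self, data):
        # Forward to every server, then send one reply to the client.
        acks = [server.store(data) for server in self.servers]
        return all(acks)
```

From the client's side this stays one request and one reply regardless of how many servers sit behind the gate.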
Another approach, which also seems interesting, is leader election (per
client), so that the leader would broadcast all the data.
--
Evgeniy Polyakov