Message-ID: <Pine.LNX.4.64.0805141516540.23143@cobra.newdream.net>
Date: Wed, 14 May 2008 15:26:31 -0700 (PDT)
From: Sage Weil <sage@...dream.net>
To: Jeff Garzik <jeff@...zik.org>
Cc: Evgeniy Polyakov <johnpol@....mipt.ru>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
linux-fsdevel@...r.kernel.org
Subject: Re: POHMELFS high performance network filesystem. Transactions,
failover, performance.
On Wed, 14 May 2008, Jeff Garzik wrote:
> Sage Weil wrote:
> > You mean if, say, some verifiable metadata or a trusted third party stores
> > that checksum? Sure. This is just pushing the what-has-committed
>
> Yes.
>
> > information to some other party, though, who will presumably face the same
> > problem of requiring a majority for verifiable correctness. This is more or
> > less what most people do in practice... using Paxos for critical state and
> > piggybacking the rest of the system's consistency off of that.
>
> More like receiving a guarantee of consensus (just like any signature on
> data), while only needing to be able to communicate with a single node.

It's the 'single node' part that concerns me. That works as long as that
node ensures there is consensus behind the scenes before handing out said
signature. Otherwise you can't be sure you're not getting an old
signature for old data.

This is more or less what I ended up doing. Since the workload is
mostly-read, the Paxos leader gives non-leaders leases to process reads in
parallel, and new elections or state changes wait if necessary to ensure
old leases are revoked or expire before any new leases on new values are
issued.
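
To make the lease rule concrete, here is a minimal single-process sketch in Python. It is not the actual implementation: the names LeaseLeader, Follower, and LeaseExpired are hypothetical, time is passed in explicitly as a number, and clock skew, networking, and the Paxos rounds themselves are all left out. It only shows the invariant that a state change waits until every outstanding lease is revoked or expired.

```python
class LeaseExpired(Exception):
    """Raised when a follower tries to read without a valid lease."""


class LeaseLeader:
    """Hypothetical leader that hands out time-bounded read leases."""

    def __init__(self, value, lease_duration):
        self.value = value
        self.version = 0
        self.lease_duration = lease_duration
        self.lease_expiry = {}  # follower name -> time its lease expires

    def grant_lease(self, follower, now):
        # The lease covers the current value until `expiry` (exclusive).
        expiry = now + self.lease_duration
        self.lease_expiry[follower] = expiry
        return self.version, self.value, expiry

    def revoke(self, follower):
        # Assumes the follower has already dropped its local copy.
        self.lease_expiry.pop(follower, None)

    def commit(self, new_value, now):
        # A state change must wait while any lease could still be valid.
        if any(expiry > now for expiry in self.lease_expiry.values()):
            return False
        self.lease_expiry.clear()
        self.value = new_value
        self.version += 1
        return True


class Follower:
    """Hypothetical non-leader that serves reads from its lease."""

    def __init__(self, name):
        self.name = name
        self.lease = None

    def acquire(self, leader, now):
        self.lease = leader.grant_lease(self.name, now)

    def release(self, leader):
        # Drop the local lease first, then tell the leader.
        self.lease = None
        leader.revoke(self.name)

    def read(self, now):
        if self.lease is None:
            raise LeaseExpired("no lease held")
        _version, value, expiry = self.lease
        if now >= expiry:
            raise LeaseExpired("lease expired")
        return value


leader = LeaseLeader("v0", lease_duration=10)
follower = Follower("a")
follower.acquire(leader, now=0)
print(follower.read(now=5))          # served locally under the lease
print(leader.commit("v1", now=5))    # blocked: lease still outstanding
print(leader.commit("v1", now=10))   # allowed: lease has expired
```

A follower whose lease has lapsed must go back to the leader before serving reads again, which is what keeps it from handing out an old value after a change has gone through.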
sage