Message-ID: <Pine.LNX.4.64.0806150918460.3341@cobra.newdream.net>
Date: Sun, 15 Jun 2008 09:41:44 -0700 (PDT)
From: Sage Weil <sage@...dream.net>
To: Evgeniy Polyakov <johnpol@....mipt.ru>
Cc: Jamie Lokier <jamie@...reable.org>, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [2/3] POHMELFS: Documentation.
On Sun, 15 Jun 2008, Evgeniy Polyakov wrote:
> Yes, not only writepage, but any request - if it sends a request and then
> receives reply (i.e. doing send/recv sequence without ability to do
> something else in between or allow other users to do sends or receives
> into the same socket), then it is synchronous. If it only sends, and
> someone else receives, it is possible to send multiple requests from
> different users who do reads or writes or lookups or whatever and
> asynchronously in different thread receive replies not in particular
> order, so this approach I call asynchronous.
Oh, so you just mean that the caller doesn't, say, hold a mutex for the
socket for the duration of the send _and_ recv? I'm kind of shocked that
anyone does that, although I suppose in some cases the protocol
effectively demands it.
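
The distinction above can be sketched roughly as follows. This is only an illustration of "many senders, one receiver, replies matched by id", not POHMELFS code: AsyncChannel, submit(), on_reply() and the send_fn callback are hypothetical names, and the transport is faked with a list. The point is that the lock protects only the table of pending requests and is never held across a send/recv pair.

```python
import itertools
import threading
from concurrent.futures import Future


class AsyncChannel:
    """Hypothetical sketch: callers only send; replies arrive out of order."""

    def __init__(self, send_fn):
        self._send_fn = send_fn        # assumed transport hook: send_fn(req_id, payload)
        self._ids = itertools.count(1)
        self._pending = {}             # req_id -> Future awaiting the reply
        self._lock = threading.Lock()  # guards _pending only, never held during I/O

    def submit(self, payload):
        """Register a request, send it, and return without waiting."""
        fut = Future()
        with self._lock:
            req_id = next(self._ids)
            self._pending[req_id] = fut
        self._send_fn(req_id, payload)
        return fut

    def on_reply(self, req_id, result):
        """Called by the receiving side for each reply, in any order."""
        with self._lock:
            fut = self._pending.pop(req_id, None)
        if fut is not None:
            fut.set_result(result)


# Fake transport: remember what was sent, then answer in reverse order.
sent = []
chan = AsyncChannel(lambda req_id, payload: sent.append((req_id, payload)))
futs = [chan.submit(op) for op in ("read", "write", "lookup")]
for req_id, payload in reversed(sent):
    chan.on_reply(req_id, payload.upper())
results = [f.result(timeout=1) for f in futs]
```

Each caller gets back the reply to its own request even though the replies were delivered in the opposite order from the sends.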
> Yes, POHMELFS does writing that way.
Nice. I will definitely be taking a look at that.
> Not exactly. Transaction in a nutshell is a wrapper on top of command
> (or multiple commands if needed like in writing), which contains all
> information needed to perform appropriate action. When user calls read()
> or 'ls' or write() or whatever, POHMELFS creates transaction for that
> operation and tries to perform it (if operation is not cached, in that
> case nothing actually happens). When transaction is submitted, it
> becomes part of the failover state machine which will check if data has
> to be read from different server or written to new one or dropped.
> The original caller may not even know from which server its data will be
> received. If request sending failed in the middle, the whole transaction
> will be redirected to new one. It is also possible to redo transaction
> against different server, if server sent us error (like I'm busy), but
> this functionality was dropped in previous release iirc, this can be
> resurrected though. With a generic transaction tree, callers do not
> have to care how to store their requests, how to wait for results, or
> how to complete them - transactions do it for them. It is not rocket
> science, but an extremely effective and simple way to help rule out
> asynchronous machinery.
Got it. Tracking pending requests in some generic way is definitely key
to making failure handling sane with multiple servers.
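
As a rough sketch of what such generic tracking might look like (my own toy model, not the POHMELFS implementation): Transaction, TransactionTable and the per-server send callables are hypothetical, and a plain dict stands in for the transaction tree. A transaction that cannot be sent to one server is redirected as a whole to the next one, and the caller never sees which server took it unless it looks.

```python
class Transaction:
    """Hypothetical wrapper around one or more commands."""

    def __init__(self, trans_id, commands):
        self.trans_id = trans_id
        self.commands = list(commands)
        self.server = None   # index of the server that accepted it
        self.done = False


class TransactionTable:
    """Tracks pending transactions and retries them on other servers."""

    def __init__(self, servers):
        # Each server is assumed to be a callable send(trans) that raises
        # ConnectionError when sending fails.
        self.servers = list(servers)
        self.pending = {}    # trans_id -> Transaction (stand-in for the tree)
        self._next_id = 1

    def submit(self, commands):
        trans = Transaction(self._next_id, commands)
        self._next_id += 1
        self.pending[trans.trans_id] = trans
        self._dispatch(trans)
        return trans

    def _dispatch(self, trans):
        # Try each server in turn; the whole transaction is redirected.
        for idx, send in enumerate(self.servers):
            try:
                send(trans)
            except ConnectionError:
                continue
            trans.server = idx
            return
        del self.pending[trans.trans_id]
        raise ConnectionError("no server accepted the transaction")

    def complete(self, trans_id):
        """Called when the reply for trans_id arrives."""
        trans = self.pending.pop(trans_id)
        trans.done = True
        return trans


def dead_server(trans):
    raise ConnectionError("server down")


accepted = []


def live_server(trans):
    accepted.append(trans.trans_id)


table = TransactionTable([dead_server, live_server])
trans = table.submit(["write page 0", "write page 1"])
finished = table.complete(trans.trans_id)
```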
> That was somewhat old approach, currently inode numbers and things like
> open-by-inode or NFS style open-by-cookie are not used. I tried to
> describe caching bits in documentation I sent, although it's a bit rough
> and likely incomplete :) Feel free to ask if there are some white areas
> there.
So what happens if the user creates a new file, and then does a stat() to
expose i_ino? Does that value change later? It's not just
open-by-inode/cookie that make ino important.
It looks like the client/server protocol is primarily path-based. What
happens if you do something like
	hosta$ cd foo
	hosta$ touch foo.txt
	hostb$ mv foo bar
	hosta$ rm foo.txt
Will hosta realize it really needs to do "unlink /bar/foo.txt"?
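
To make the concern concrete, here is a toy model of a purely path-based server. PathServer and its methods are hypothetical and say nothing about what POHMELFS actually does; the model only shows what goes wrong if a client keeps composing requests from a cached working-directory path after another client renames that directory.

```python
class PathServer:
    """Toy server whose namespace is keyed by full path strings."""

    def __init__(self):
        self.dirs = {"/"}
        self.files = set()

    def mkdir(self, path):
        self.dirs.add(path)

    def create(self, path):
        self.files.add(path)

    def rename_dir(self, old, new):
        # Simplification: only handles a directory that has no subdirectories.
        self.dirs.remove(old)
        self.dirs.add(new)
        self.files = {
            new + f[len(old):] if f.startswith(old + "/") else f
            for f in self.files
        }

    def unlink(self, path):
        if path not in self.files:
            raise FileNotFoundError(path)
        self.files.remove(path)


server = PathServer()
server.mkdir("/foo")
hosta_cwd = "/foo"                          # hosta$ cd foo
server.create(hosta_cwd + "/foo.txt")       # hosta$ touch foo.txt
server.rename_dir("/foo", "/bar")           # hostb$ mv foo bar
try:
    server.unlink(hosta_cwd + "/foo.txt")   # hosta$ rm foo.txt
    stale_path_worked = True
except FileNotFoundError:
    stale_path_worked = False
```

In this model the unlink fails because hosta still names the file by its old path, while the server now only knows it as /bar/foo.txt.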
sage