Message-ID: <4829E752.8030104@garzik.org>
Date: Tue, 13 May 2008 15:09:06 -0400
From: Jeff Garzik <jeff@...zik.org>
To: Evgeniy Polyakov <johnpol@....mipt.ru>
CC: linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
linux-fsdevel@...r.kernel.org
Subject: Re: POHMELFS high performance network filesystem. Transactions, failover,
performance.
Evgeniy Polyakov wrote:
> Hi.
>
> I'm pleased to announce the POHMELFS high performance network filesystem.
> POHMELFS stands for Parallel Optimized Host Message Exchange Layered File System.
>
> Development status can be tracked in the filesystem section [1].
>
> This is a high performance network filesystem with a local coherent cache of data
> and metadata. Its main goal is distributed parallel processing of data; the network
> filesystem is the client transport. The POHMELFS protocol has proven superior to
> NFS in many operations (if not yet all, the rest are on the roadmap).
>
> This release brings the following features:
> * Fast transactions. The system wraps all writes into transactions, which
> will be resent to a different (or the same) server in case of failure.
> Details are in the notes [1].
> * Failover. It is now possible to provide a number of servers to be used in
> round-robin fashion when one of them dies. The system will automatically
> reconnect to the others and send transactions to them (a rough sketch
> follows this list).
> * Performance. Super fast (close to the wire limit) metadata operations over
> the network. Thanks to the writeback cache and transactions, the whole
> kernel archive can be untarred in 2-3 seconds (including sync) over a
> GigE link (wire limit! Not comparable to NFS).
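> 
> As a rough illustration of the failover behaviour above, here is a toy,
> self-contained sketch of round-robin resend: a transaction is simply retried
> against the next configured server when a send fails. All names and the
> dead-server simulation are illustrative only and do not correspond to the
> real POHMELFS code:
> 
> #include <stdio.h>
> 
> #define NR_SERVERS	3
> #define NR_TRANS	4
> 
> struct trans {
> 	int id;
> 	int acked;		/* set once a server acknowledged it */
> };
> 
> /* Pretend server 0 drops dead after handling two transactions. */
> static int handled;
> 
> static int send_trans(int server, struct trans *t)
> {
> 	if (server == 0 && handled >= 2)
> 		return -1;	/* connection lost */
> 	handled++;
> 	t->acked = 1;
> 	printf("trans %d acked by server %d\n", t->id, server);
> 	return 0;
> }
> 
> int main(void)
> {
> 	struct trans pending[NR_TRANS] = { {.id = 0}, {.id = 1}, {.id = 2}, {.id = 3} };
> 	int cur = 0;
> 
> 	for (int i = 0; i < NR_TRANS; i++) {
> 		/* On failure, move to the next server and resend. */
> 		while (send_trans(cur, &pending[i]) < 0)
> 			cur = (cur + 1) % NR_SERVERS;
> 	}
> 	return 0;
> }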
>
> Basic POHMELFS features:
> * Local coherent (notes [5]) cache for data and metadata.
> * Completely async processing of all events (hard links and symlinks are the
> only exceptions), including object creation and data reading.
> * Flexible object architecture optimized for network processing. Ability to
> create long paths to an object and remove arbitrarily huge directories in a
> single network command (see the command layout sketch after this list).
> * High performance is one of the main design goals.
> * Very fast and scalable multithreaded userspace server. Being in userspace,
> it works with any underlying filesystem and is still much faster than the
> async in-kernel NFS server.
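> 
> For a flavour of the object architecture above: a command of roughly the
> following shape carries the full path of the object it operates on, which is
> what allows creating a deep path or removing an arbitrarily huge directory in
> a single network command. The struct and field names here are illustrative
> only, not the actual on-wire format:
> 
> #include <stdint.h>
> 
> /* Toy path-addressed command; the real POHMELFS wire format differs. */
> struct net_cmd {
> 	uint32_t cmd;	/* operation: CREATE, REMOVE, READDIR, ... */
> 	uint32_t flags;
> 	uint64_t id;	/* server-side object id, if already known */
> 	uint32_t size;	/* length of the attached data */
> 	/* 'size' bytes of data follow, e.g. the path "a/b/c/dir-to-remove" */
> };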
>
> Roadmap includes:
> * Server extension to allow storing data on multiple devices (like mirroring),
> first by saving data in several local directories (think of a server which has
> mounted remote dirs over POHMELFS or NFS alongside local dirs).
> * Client/server extension to report lookup and readdir results not only for the
> local destination, but also for different addresses, so that reading/writing can
> be done from different nodes in parallel.
> * Strong authentication and possibly data encryption in the network channel.
> * Async writing of data from the receiving kernel thread into
> userspace pages via copy_to_user() (check the development tracking
> blog for results).
>
> One can grab the sources from the archive or git [2], or check the homepage [3].
> The benchmark section can be found in the blog [4].
>
> The nearest roadmap (scheduled for the end of the month) includes:
> * Full transaction support for all operations (currently only writeback is
> guarded by transactions; the default network state
> just reconnects to the same server).
> * Data and metadata coherency extensions (in addition to existing
> commented object creation/removal messages). (next week)
> * Server redundancy.
This continues to be a neat and interesting project :)
Where is the best place to look at the client<->server protocol?
Are you planning to support the case where the server filesystem dataset
does not fit entirely on one server?
What is your opinion of the Paxos algorithm?
Jeff