Message-Id: <A8B216D0-06C5-4D85-9AA6-EB4C9E87D26B@oracle.com>
Date: Thu, 17 Jun 2010 19:41:25 -0600
From: Andreas Dilger <andreas.dilger@...cle.com>
To: "J. R. Okajima" <hooanon05@...oo.co.jp>
Cc: David Dillow <dillowda@...l.gov>,
Valerie Aurora <vaurora@...hat.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christoph Hellwig <hch@...radead.org>,
Miklos Szeredi <miklos@...redi.hu>,
Jan Blunck <jblunck@...e.de>,
Jamie Lokier <jamie@...reable.org>,
David Woodhouse <dwmw2@...radead.org>,
Arnd Bergmann <arnd@...db.de>,
Andreas Dilger <adilger@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH v2] d_ino considered harmful
On 2010-06-17, at 12:04, "J. R. Okajima" <hooanon05@...oo.co.jp> wrote:
>
> I am interested in this simplified
> problem such as "find the pathname(s) from an inum in a huge fs."
> Is ne2scan essentially equivalent to "debugfs ncheck inum"?
The (n)e2scan program is essentially just an optimized ext3 inode
table scanner we wrote for Lustre. It walks the inode table in order,
optimistically reads directory inode blocks (in disk offset order),
and matches the inode numbers to an incrementally-built tree of parent
directories as the directory entries appear. Since the most common
case is that a parent has a lower inode number than its subdirectories,
there is rarely a need to keep whole subdirectories in memory. This
is fairly efficient when dumping the whole filesystem, since it makes
a single pass over the metadata, though it is inefficient when only a
small subset of the filesystem is needed.
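
As a rough illustration of the single-pass idea described above, here
is a minimal Python sketch. It is not the (n)e2scan code: the function
name scan_paths, the in-memory inode_table input, and the is_dir flag
on each entry are all assumptions made for the example, and "." and
".." entries are assumed to be filtered out already. Only the root
inode number (2 on ext2/ext3/ext4) comes from the on-disk format.

```python
ROOT_INO = 2  # root directory inode number on ext2/ext3/ext4


def scan_paths(inode_table):
    """Resolve pathnames in one pass over an inode table.

    inode_table: iterable of (ino, entries) in ascending inode order.
    entries is None for non-directories, otherwise a list of
    (name, child_ino, is_dir) tuples (hypothetical input format).
    Returns a list of (child_ino, path), one per directory entry.
    """
    dir_path = {ROOT_INO: "/"}  # directory inode -> resolved path
    pending = {}                # directory inode -> entries awaiting a path
    results = []

    def emit(dir_ino, entries):
        # Resolve the entries of one directory, then any directories
        # that were parked in 'pending' and are now reachable.
        work = [(dir_ino, entries)]
        while work:
            d, ents = work.pop()
            base = dir_path[d].rstrip("/")
            for name, child, is_dir in ents:
                path = base + "/" + name
                results.append((child, path))
                if is_dir:
                    dir_path[child] = path
                    if child in pending:
                        work.append((child, pending.pop(child)))

    for ino, entries in inode_table:
        if entries is None:
            continue  # not a directory, nothing to read
        if ino in dir_path:
            emit(ino, entries)      # common case: parent was seen earlier
        else:
            pending[ino] = entries  # rare case: parent has a higher inode


    return results
```

In this sketch only directories whose parent has not been seen yet are
held in memory, which matches the observation that parents usually have
lower inode numbers than their subdirectories.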
As the name implies, it is very extN specific. For Lustre 2.0 we use a
different method to get O(1) FID (inode number) to pathname(s)
lookup. Each file stores an xattr holding a {parent FID, filename}
tuple for each link to the file, and the xattr is updated whenever an
inode is created, linked, unlinked, or renamed.
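
The bookkeeping described above could look roughly like the following
Python sketch. Everything here is hypothetical: the record layout
(8-byte parent FID, 2-byte name length, name bytes), the Inode class,
and the function names are assumptions for illustration and are not
Lustre's actual on-disk format or API. FIDs are modeled as plain
integers.

```python
import struct

HDR = "<QH"  # hypothetical record header: parent FID, name length


def pack_links(links):
    """Serialize [(parent_fid, name), ...] into one xattr-style blob."""
    out = bytearray()
    for parent_fid, name in links:
        raw = name.encode("utf-8")
        out += struct.pack(HDR, parent_fid, len(raw)) + raw
    return bytes(out)


def unpack_links(blob):
    """Inverse of pack_links."""
    links, off = [], 0
    while off < len(blob):
        parent_fid, n = struct.unpack_from(HDR, blob, off)
        off += struct.calcsize(HDR)
        links.append((parent_fid, blob[off:off + n].decode("utf-8")))
        off += n
    return links


class Inode:
    """Toy inode: a link count plus the link xattr."""
    def __init__(self, fid):
        self.fid = fid
        self.nlink = 0
        self.xattr = b""


def link(inode, parent_fid, name):
    # Create or hard-link: add one tuple, bump nlink in the same update.
    inode.xattr = pack_links(unpack_links(inode.xattr) + [(parent_fid, name)])
    inode.nlink += 1


def unlink(inode, parent_fid, name):
    links = unpack_links(inode.xattr)
    links.remove((parent_fid, name))
    inode.xattr = pack_links(links)
    inode.nlink -= 1


def rename(inode, old_parent, old_name, new_parent, new_name):
    # Rename replaces one tuple in place; nlink is unchanged.
    links = unpack_links(inode.xattr)
    links[links.index((old_parent, old_name))] = (new_parent, new_name)
    inode.xattr = pack_links(links)
```

Each operation touches the xattr and the link count together, which is
the property the message relies on: the inode is being written anyway.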
In the common case, storing the filename and parent FID adds no
overhead to these operations since the inode needs to be written to
update the nlink count anyway, and the xattr can be stored in the
inode and does not generate extra IO unless there are more hard links
than can fit in the inode.
This allows doing optimized pathname generation for all links to a
file, and can in theory be used for any type of filesystem that has
efficient xattr storage.
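
To make the pathname generation concrete, here is a minimal Python
sketch of walking the stored {parent FID, filename} tuples up to the
root. The links_of dictionary stands in for reading the xattr of a
given FID, and ROOT_FID, fid_to_paths, and the sample FIDs are all
hypothetical names and values chosen for the example.

```python
ROOT_FID = 1  # hypothetical FID of the filesystem root


def fid_to_paths(fid, links_of):
    """Return every pathname that leads to 'fid'.

    links_of maps a FID to its list of (parent_fid, name) tuples,
    i.e. what the per-file xattr would contain.
    """
    if fid == ROOT_FID:
        return ["/"]
    paths = []
    for parent_fid, name in links_of.get(fid, []):
        # Resolve the parent first, then append this link's name.
        for parent_path in fid_to_paths(parent_fid, links_of):
            paths.append(parent_path.rstrip("/") + "/" + name)
    return paths


# Sample data: FID 20 has two hard links in different directories.
links_of = {
    10: [(1, "home")],
    11: [(10, "alice")],
    12: [(1, "tmp")],
    20: [(11, "notes.txt"), (12, "notes-link.txt")],
}
print(fid_to_paths(20, links_of))
```

In this model each step is one lookup of a parent's tuples, so the work
for one pathname grows with the depth of the path and not with the size
of the filesystem.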