lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 26 Sep 2006 17:16:43 -0400
From:	Theodore Tso <tytso@....edu>
To:	Andreas Dilger <adilger@...sterfs.com>
Cc:	Alexandre Ratchov <alexandre.ratchov@...l.net>,
	linux-ext4@...r.kernel.org,
	Jean-Pierre Dion <jean-pierre.dion@...l.net>
Subject: Re: [patch 04/12] rfc: 2fsprogs update

On Tue, Sep 26, 2006 at 01:54:49PM -0600, Andreas Dilger wrote:
> The only function which matches this description is block_iterate_extents()
> but it is not desired that this be a public interface.  It is just the
> "bridge" function between the block iterator and the extent iterator.

Yes, and I thought about simply pouring your extent.c file into
block.c so I could make it a static function.  That would have solved
the function, although made block.c a bit bulky.  But then when I
started thinking about the interface, I realized I wasn't completely
happy with it anyway, so I started rethinking what we wanted to
export.

> > errcode_t ext2fs_extent_iterate(ext2_filsys fs,
> > 				ext2_ino_t	ino,
> > 				int	flags,
> > 				char *block_buf,
> > 				int (*func)(ext2_filsys fs,
> > 					    struct ext2fs_extent *extent,
> > 					    void	*priv_data),
> > 				int (*meta_func)(ext2_filsys fs,
> > 						 blk64_t blk,
> > 						 int blk_type,
> > 						 char *buf,
> > 						 void	*priv_data),
> > 				void *priv_data);
> 
> Hmm, except that this interface isn't sufficient (at first glance) to
> allow full correctness checking of the extent metadata blocks.  It
> would allow checking of a given indirect/index block but not parent/child
> relationships between the index block and the extents/index it points to...

I believe that if we do breadth-first descent through the tree, and
call meta_func three times for each interior node of the tree.  So
when we hit an interior node, we call the meta_func() callback
function, and then we interate through all of the contents of the
node, calling either the func or meta_func callback as appropriate,
and then we call meta_func callback on the parent node again.  If the
child nodes are interior nodes, we then descend down the tree by
taking each interior node in turn, treating each one as a parent,
i.e., calling the meta_func() callback first, then func() or
meta_func() for all of its children, and then meta_func() callback
again.   

The blk_type would allow the callback function to determine whether it
is being called as PARENT_BEGIN_NODE, CHILD_ITERATE_NODE, or
PARENT_END_NODE.

> > This interface will work for both extent and non-extent-based
> > inodes.... that is, if this interface is called on an inode which is
> > using direct and indirect blocks, the function will Do The Right Thing
> > and find contiguous blocks runs which it will use to fill extent
> > structures that will be passed to the callback function.  This is fine,
> > since extent-based interfaces will be easier and more efficient to use
> > anyway.
> 
> Do you think this will be CPU-intensive on non-extent filesystems?

No, I don't think so.

For the application that needs to iterate over all of the blocks in
the file, whether we do it block by block, or do it inside the
block_interate_extent() function and then returning each extent one at
a time, it's the same amount of work.  In fact, by coalescing adjacent
blocks into an extent, the result is probably more cache friendly and
might actually save execution time.   

This is especially true for userspace callers, since they pretty much
all interate over the file sequentially.  The only reason why we might
not want to do this in the kernel is because it may not make sense for
random access files ---- although as we've discussed before, for
certain files, like database files, which are opened, and then
accessed repeatedly via a random access pattern, it may make sense to
figure out all of the extents in advance, store the extent tree in
memory, to accelerate random access lookups.

> > The other interface which I've started spec'ing out in my mind is a new
> > form interface and implementation for bitmaps().  The new-style bitmaps
> > will take a blk64_t type, but their biggest difference is that they will
> > allow multiple different types of interfaces, much like the io_manager
> > abstractions we have right now abstracts our I/O reoutines.  Some
> > implementations may use an extents tree to keep track of used and unused
> > bits.....
>
> It sounds desirable, but is this going to be a requirement for 1.40?  It
> seems like a lot of non-critical work at a point where you have little free
> time and there are other things awaiting the release of 1.40.

No --- but that raises the question of what we need to get into 1.40,
and what time line.  From my way of thinking of things, the big thing
we need to get into 1.40 is extent support, but I'd like to make the
interfaces be correct for 64-bits, even if we don't all get them
there.

Also, if we can get the interfaces _right_, it doesn't necessarily
have to me who does all of the coding.  Others can hopefully do some
of the coding to for example adapt the e2fsck patches to use the new
interfaces --- and if we've gotten the interfaces right, that really
shouldn't be that difficult.

As far as the non-critical work of supporting extra bitmap types, we
don't have to do it all; we just have to do enough of the framework to
support them, and then implement the basic
store-the-whole-bitmap-in-memory one to get started.  We can always do
the rest later.

Regards,

					- Ted
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ