Message-ID: <20080421125358.GD9700@mit.edu>
Date:	Mon, 21 Apr 2008 08:53:58 -0400
From:	Theodore Tso <tytso@....EDU>
To:	Alexey Zaytsev <alexey.zaytsev@...il.com>
Cc:	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	Rik van Riel <riel@...riel.com>
Subject: Re: Mentor for a GSoC application wanted (Online ext2/3 filesystem
	checker)

On Mon, Apr 21, 2008 at 04:23:42AM +0400, Alexey Zaytsev wrote:
> Not really. In my application I propose some changes to the fsck pass
> order to avoid the need to rerun it. And I don't get what dependency you
> are talking about. The only one I see is between the directory entries and
> the directory inode. Should not be hard to solve.
> (Or do I miss something? Could you give more examples maybe?)

And *this* is why I ultimately decided I didn't have the time to
mentor you.  There are large numbers of other dependencies.

For example, between the direct and indirect blocks in the inode, and
the block allocation bitmaps.  (Note that e2fsck keeps up to 3
different block bitmaps and 6 different inode bitmaps.)
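
A minimal sketch of that dependency, not e2fsck's actual code: the
blocks claimed by every inode have to be collected before the on-disk
allocation bitmap can be judged.  The function name, the dictionary
layout and the block numbers are hypothetical, and filesystem metadata
blocks (superblock, group descriptors and so on) are ignored.

```python
def cross_check(inode_blocks, on_disk_bitmap):
    """Compare blocks claimed by inodes against the allocation bitmap.

    inode_blocks:   dict mapping inode number -> list of block numbers
                    reachable through its direct and indirect blocks
    on_disk_bitmap: set of block numbers the filesystem marks as in use
    """
    found = set()  # blocks claimed by at least one inode
    dup = set()    # blocks claimed more than once
    for ino, blocks in inode_blocks.items():
        for blk in blocks:
            if blk in found:
                dup.add(blk)
            found.add(blk)
    return {
        "duplicate": dup,
        "used_but_free": found - on_disk_bitmap,
        "marked_but_unused": on_disk_bitmap - found,
    }


# Hypothetical data: inodes 12 and 13 both claim block 101.
report = cross_check(
    {12: [100, 101], 13: [101, 102]},
    on_disk_bitmap={100, 101, 103},
)
print(report)
```

None of the three result sets is meaningful until the loop has seen
every inode, so a change to any inode's block list during the scan can
invalidate conclusions already drawn about the bitmap.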

You need to know which inodes are directories and which inodes are
regular files.  E2fsck currently keeps these bitmaps so we don't have
to cache the entire 128-byte inode for all inodes.  (Instead, we
cache a single bit for every single inode.  There's a ***reason*** for
all of these bitmaps.)
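
To make the memory argument concrete, here is a rough sketch of a
one-bit-per-inode map next to the cost of caching whole inodes.  The
class name and the inode count are assumptions chosen for
illustration; only the 128-byte inode size comes from the text above.

```python
INODE_SIZE = 128  # bytes per on-disk inode, as stated above


class InodeBitmap:
    """One bit per inode; inode numbers start at 1, as in ext2/3."""

    def __init__(self, inode_count):
        self.inode_count = inode_count
        self.bits = bytearray((inode_count + 7) // 8)

    def mark(self, ino):
        idx = ino - 1
        self.bits[idx >> 3] |= 1 << (idx & 7)

    def test(self, ino):
        idx = ino - 1
        return bool(self.bits[idx >> 3] & (1 << (idx & 7)))


inode_count = 1_000_000          # hypothetical filesystem size
dir_map = InodeBitmap(inode_count)
reg_map = InodeBitmap(inode_count)
dir_map.mark(2)                  # inode 2 is the root directory
reg_map.mark(12)                 # hypothetical regular file

full_cache_bytes = inode_count * INODE_SIZE
bitmap_bytes = len(dir_map.bits)
print(full_cache_bytes, bitmap_bytes, full_cache_bytes // bitmap_bytes)
```

Under these assumptions each bitmap costs 125,000 bytes, against
128,000,000 bytes for a full inode cache, a factor of 1024.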

You also need to know which blocks are being used to store extended
attributes, which may potentially be shared across multiple inodes.  

That's just *three* additional dependencies, and there are many more.
If you can't think of them, how much time would it take for me as
mentor to explain all of this to you?

> >  In either case, there is still the issue of knowing exactly whether a
> >  particular read happened before or after some change in the
> >  filesystem.  This race condition is a really hard one to deal with,
> >  especially on a multiple CPU system and the filesystem checker is
> >  running in userspace.
> 
> I don't see why should fsck care about this. The notification is always sent
> after the write happened, so fsck should just re-read the data. No problem
> if it already read the (half-)updated version just before the notification.

Keep in mind that when a file gets deleted, a *large* number of
metadata blocks will potentially get updated.  So while e2fsck is
handling these reads, a bunch more can start coming in from other
filesystem transactions, and since the kernel doesn't know what
userspace has already cached, it will have to send them again... and
again...  

In fact if the filesystem is being very quickly updated, the
notifications could easily overrun whatever buffers have been set up to
transfer this information from the kernel to userspace.  Worse
yet, unless you also send down transaction boundaries, the userspace
won't know when the filesystem has reached a "stable state" which
would be internally consistent.
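
The two problems above can be sketched with a toy model.  Everything
here is hypothetical: no such kernel interface exists in the thread,
and the event names, queue class and capacity are invented purely to
illustrate overrun and transaction boundaries.

```python
from collections import deque


class NotificationQueue:
    """Bounded kernel-to-userspace queue (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.overrun = False

    def push(self, event):
        if len(self.events) >= self.capacity:
            self.overrun = True  # the event is lost
            return False
        self.events.append(event)
        return True


def drain(queue):
    """Split changed blocks into committed and still-in-flight sets."""
    stable, pending = set(), set()
    while queue.events:
        kind, value = queue.events.popleft()
        if kind == "block":
            pending.add(value)
        elif kind == "commit":   # transaction boundary marker
            stable |= pending
            pending = set()
    return stable, pending


q = NotificationQueue(capacity=4)
q.push(("block", 100))
q.push(("block", 101))
q.push(("commit", 1))
q.push(("block", 102))
accepted = q.push(("block", 103))  # queue is full, so this is dropped
stable, pending = drain(q)
print(stable, pending, q.overrun)
```

Blocks 100 and 101 belong to a finished transaction and are safe to
re-read.  Block 102 is not, since no commit marker followed it.  Once
`overrun` is set, the checker no longer knows which blocks changed and
would have to rescan.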

There are ways that this could be solved, but at the end of the day,
the $1,000,000 question is why not just do a kernel-side snapshot?
Then you don't have to completely rewrite e2fsck --- and given that
you've claimed the e2fsck code is "hard to understand", it seems
especially audacious that you would have thought you could do this in
3 months.  If you really don't want to use LVM, you could have
proposed a snapshot solution which didn't involve devicemapper.  It's
not clear it would have entered mainline, but at least there would
have been some non-zero chance that you would complete the project
successfully.

Regards,

						- Ted