[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87bmvliwzy.fsf@esperi.org.uk>
Date: Thu, 05 Jan 2017 11:09:37 +0000
From: Nick Alcock <nick.alcock@...eri.org.uk>
To: "Darrick J. Wong" <darrick.wong@...cle.com>
Cc: "Theodore Ts'o" <tytso@....edu>,
Linux FS Maling List <linux-fsdevel@...r.kernel.org>,
daeho.jeong@...sung.com, linux-ext4 <linux-ext4@...r.kernel.org>
Subject: Re: Trouble mounting metadata_csum ext4 filesystems with v4.7.x after c9274d891869880648c4ee9365df3ecc7ba2e285: not enough inode bytes checksummed?
[Aside: This got misfiltered into a mailbox with a typo in the name,
more or less electronic oblivion. I just recovered it while preparing
to migrate to notmuch. Sorry for the huge delay.]
On 20 Sep 2016, Darrick J. Wong stated:
> [cc Ted and the ext4 list]
>
> On Mon, Sep 19, 2016 at 03:19:03PM +0100, Nix wrote:
>> We are not checksumming enough bytes! We used to checksum the entire
>> 256-byte inode: now, we checksum only 130 bytes of it, which isn't even
>> enough to cover the 28-byte extra_isize on this filesystem and is more
>> or less guaranteed to give the wrong answer. I'd fix the problem, but I
>> frankly can't see how the new code is meant to be equivalent to the old
>> code in any sense -- most particularly what the stuff around dummy_csum
>> is meant to do -- so I thought it better to let the people who wrote it
>> fix it :)
[...]
> Sh*t, I missed this during the review. So your filesystem image (thank
> you!) had this to say:
e2image is *such* a good program. :)
> debugfs> stats
> Inode size: 256
> debugfs> stat <8>
> Size of extra inode fields: 0
>
> Basically, a 128-byte inode inside a filesystem that allocated 256 bytes
> for each inode.
This was due to the change in mke2fs defaults catching me out, long ago:
I was assuming a 128-byte inode was the default, but it wasn't, or I'd
have changed things and avoided wasting the space...
I tried using inline data to save the space again (again, in the v4.7.x
period) and had a bit of a disaster a few weeks later: running dovecot's
configure with the srcdir and objdir on an inline data filesystem leads
to an oops shortly afterwards and massive data corruption on remount.
I'll whip up a replication case for this fairly soon. I suspect shared
writable mmap is being evil again.
> never bother to checksum anything after that. This is of course wrong
> since we no longer checksum the xattr space and we've deviated from the
> pre-4.7.4 (documented) on-disk format.
... and of course that'll lead to checksum failures, and thanks to
metadata checksums this is instantly obvious! :)
> *FORTUNATELY* since the root inode fails the read verifier, the file (in
> this case the root dir) can't be modified and therefore nothing has been
> corrupted. Especially fortunate for you, the fs won't mount, reducing
> the chances that any serious damage has occurred.
Even if it had, nothing too bad would result. There's a reason I do
hourly-and-daily backups, and keep new features switched well and truly
off on the backup filesystem!
> I /think/ the fix in this case is to hoist the last ext4_chksum call
> out of the EXT4_FITS_IN_INODE blob:
I'll give that a try this weekend. Sorry for the huge delay!
>> ... The mystery is why this isn't going wrong anywhere else: I have
>> metadata_csum working on every fs on a bunch of other ext4-using
>> systems, and indeed on every filesystem on this machine as well, as long
>> as c9274d8 is not applied. Many of them are similarly upgraded pre-csum
>> fses with the same inode size and extra_isize, but they work...
>
> Hard to say, but this bug only affects inodes with 0 < i_extra_size <= 4
> i.e. 128 < inode-core-size <= 130. Maybe an old ext3 fs that was
> created with 256 bytes per inode but inode-core-size of 128?
It was ext4 from the start, created with -i 128, inode_size = 256.
> Uck. Sorry about this mess.
Sorry I overlooked your rapid response for so long!
--
NULL && (void)
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists