linux-ext4 - Re: JBD2: Spotted dirty metadata buffer....


Message-Id: <77A4A0E4-C85C-4A9A-AB20-78754F95FEEC@dilger.ca>
Date:   Wed, 6 Sep 2017 12:07:11 -0600
From:   Andreas Dilger <adilger@...ger.ca>
To:     Wolfgang Walter <linux@...m.de>, Tao Ma <boyu.mt@...bao.com>
Cc:     Ext4 Developers List <linux-ext4@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: JBD2: Spotted dirty metadata buffer....

On Sep 6, 2017, at 6:46 AM, Wolfgang Walter <linux@...m.de> wrote:
> 
> On Monday, 28 November 2016, 12:26:38, Wolfgang Walter wrote:
>> On Wednesday, 23 November 2016, 16:40:07, Andreas Dilger wrote:
>>> Stepping back a bit - does this problem only happen with an external
>>> journal device, or does it also happen with an internal journal?
>> 
>> So I tried that this weekend. I again got these messages:
>> 
>> JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 241763277). There's a risk of filesystem corruption in case of system crash.
>> 
>> So this also happens with an internal journal.
>> 
> 
> [snip]
> 
> I last tried with 4.9.46 and I still see that problem when rsyncing data to the filesystem: errors similar to
> 
> JBD2: Spotted dirty metadata buffer (dev = dm-25, blocknr = 1008028301). There's a risk of filesystem corruption in case of system crash.
> 
> A later filesystem check does not show any errors.
> 
> 
> With 4.9.46 stable kernels I also sometimes get the following error:
> 
> EXT4-fs error (device dm-25): ext4_iget:4501: inode #74061557: comm rsync: checksum invalid
> or
> EXT4-fs error (device dm-25): ext4_iget:4501: inode #155844677: comm nfsd: checksum invalid
> 
> A filesystem check then says that the inode itself seems OK but the checksum is indeed wrong.
> 
> These inodes all belong to very small files.
> 
> So I finally copied everything away and reinitialized the filesystem, this time without -O inline_data.
> 
> Since then everything has worked fine. So I assume there is a problem with inodes and inline data (at least up to 4.9.46), maybe only with data=journal.

Adding Tao Ma, since he is the original inline data author, and it appears that Taobao
is using this feature, so they are probably interested in fixing any corruption.

Wolfgang, it would be useful if you could figure out what type of block the reported
blocknr=1008028301 is.  You can use "debugfs -c -R 'icheck 1008028301' /dev/dm-25".
I'd suspect, given that this is an inline data problem, that this is an inode table
block, but it is good to be sure.  It would also be useful to see if this inode
number correlates to one of the inodes that have bad checksums.
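
To make the correlation step concrete, here is a minimal Python sketch, not
taken from e2fsprogs or the kernel. It pulls the device, block number, and
inode number out of the two messages quoted above, and computes where an inode
sits inside its group's inode table. The function names and regular
expressions are hypothetical, and the geometry arguments (inodes per group,
inode size, block size) are assumptions that would have to be read from
"dumpe2fs -h" on the real filesystem.

```python
import re

# Patterns for the two kernel messages quoted in this thread (illustrative).
JBD2_RE = re.compile(
    r"JBD2: Spotted dirty metadata buffer \(dev = (\S+), blocknr = (\d+)\)")
IGET_RE = re.compile(
    r"EXT4-fs error \(device ([^)]+)\): ext4_iget:\d+: "
    r"inode #(\d+): comm (\S+): checksum invalid")


def parse_log(lines):
    """Return (dirty_blocks, bad_inodes) as sets of (device, number) pairs."""
    dirty_blocks, bad_inodes = set(), set()
    for line in lines:
        m = JBD2_RE.search(line)
        if m:
            dirty_blocks.add((m.group(1), int(m.group(2))))
        m = IGET_RE.search(line)
        if m:
            bad_inodes.add((m.group(1), int(m.group(2))))
    return dirty_blocks, bad_inodes


def inode_table_slot(ino, inodes_per_group, inode_size, block_size):
    """Locate an inode inside its block group's inode table.

    Returns (group, block_offset, byte_offset): the block group number, the
    block index relative to the start of that group's inode table, and the
    byte offset of the inode within that block.
    """
    index = (ino - 1) % inodes_per_group   # inode numbers start at 1
    group = (ino - 1) // inodes_per_group
    byte_pos = index * inode_size
    return group, byte_pos // block_size, byte_pos % block_size
```

The absolute block number is the start of that group's inode table (listed per
group by dumpe2fs) plus block_offset. If that matches a blocknr from a JBD2
message on the same device, the dirty buffer is the inode table block holding
one of the inodes with a bad checksum. The debugfs "imap" command should report
the same location directly for a given inode.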

Cheers, Andreas





