Message-ID: <7669753e-ec6c-421a-a132-3ae00b3b3db9@dybdal.dk>
Date: Tue, 1 Oct 2024 00:17:51 +0200
From: Jesper Dybdal <jd-ext4@...dal.dk>
To: linux-ext4@...r.kernel.org
Cc: Andreas Dilger <adilger@...ger.ca>
Subject: Re: Corrupted i_blocks field

On 2024-09-30 22:29, Andreas Dilger wrote:
> On Sep 27, 2024, at 8:38 AM, Jesper Dybdal <jd-ext4@...dal.dk> wrote:
>> I have now a few times experienced a problem with the i_blocks field of a few inodes being corrupted (replaced by extremely large numbers).
>>
>> I don't believe that it is a disk error - the file system is on a RAID1 partition and the RAID consistency is checked regularly.
>> I also find it hard to believe that it is a RAM error - the machine has run memtest86+ overnight without finding anything.
>>
>> The files I've seen corrupted are simple small text files that are modified only using an ordinary text editor (emacs).
>>
>> Fsck fixes it.
>> The system is an up-to-date Debian Bookworm:
>>      Linux nuser 6.1.0-25-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.106-3 (2024-08-26) x86_64 GNU/Linux
>>
>> I do one thing that is not the default for ext4: I use the "nodelalloc" option (because several years ago, there was a discussion about "delalloc or not" from which I got the impression that nodelalloc was probably slightly safer - if the resulting performance reduction is not a problem, which it is not for me):
>>       /dev/md0 on / type ext4 (rw,relatime,nodelalloc,errors=remount-ro)
>>
>> Three examples follow below.  Note that the bad field values, when interpreted as 48-bit signed numbers, are numerically small negative numbers (-25, -9, -3, respectively).
>>
>> Excerpts from the fsck logs:
>> root: Inode 10748715, i_blocks is 281474976710631, should be 5. FIXED.
>> root: Inode 10751288, i_blocks is 281474976710647, should be 3. FIXED.
>> root: Inode 10748542, i_blocks is 281474976710653, should be 1. FIXED.
>>
>> I don't know when the first two of these corruptions occurred, but the last one happened yesterday or the day before.  The file in question was /etc/fstab, and I discovered the problem after I had edited fstab on Wednesday and rebooted on Thursday.
>>
>> The corrupted files can be read and copied without problems.  I have not dared to delete any of those files before fsck had fixed them.
>>
>> What is going on here?
> This looks like an underflow of the used blocks count on the inode:
>
>      281474976710631 = 0xffffffffffe7
>      281474976710647 = 0xfffffffffff7
>      281474976710653 = 0xfffffffffffd
>
> These values are just below 2^48, which is the limit for the number of blocks
> that fit into the available inode fields (32-bit i_blocks_lo, 16-bit i_blocks_hi).
>
> There is likely some kind of accounting error in the code.  Is anything
> unusual with access patterns for those files (large xattrs/ACLs, are they
> files or directories or special files, mmap, truncate, fallocate, etc.)?
No.  They are all simple small text configuration files, and I edit them 
using Emacs.  The only slightly unusual thing is, as I wrote earlier, 
that the file system is mounted with the nodelalloc option.
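
The underflow reading of the three fsck values can be checked with a few lines of Python. This is only a minimal sketch: the helper name as_signed is made up for illustration, and the 48-bit width is taken from the 32-bit + 16-bit field split mentioned in the quoted text above.

```python
# i_blocks values as reported in the fsck log excerpts.
reported = [281474976710631, 281474976710647, 281474976710653]

# Assumed width: 32-bit low field plus 16-bit high field, as described above.
BITS = 48


def as_signed(value, bits=BITS):
    """Reinterpret an unsigned `bits`-wide value as a two's-complement number.

    Hypothetical helper for illustration; it is not part of any ext4 tool.
    """
    if value >= 1 << (bits - 1):
        return value - (1 << bits)
    return value


for v in reported:
    # Print the raw value, its hex form, and the signed interpretation.
    print(f"{v} = {v:#x} -> {as_signed(v)}")
```

Each value comes out as a small negative number (-25, -9, -3), which matches a counter that was decremented below zero and wrapped around at 2^48.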

The files I have identified are fstab and two Postfix configuration
files: /etc/postfix/{main.cf,master.cf}. The problem has actually hit
master.cf twice.

I have verified that the only reboot between the fstab edit on
Wednesday and my seeing the problem on Thursday was a clean,
deliberate reboot - no power outage or similar.
> If you are able to reproduce with the /etc/fstab editing, possibly strace
> could help to identify if something unusual is being done to the file.
I'll try, but I do not really expect Emacs to do strange things to the file.
> Cheers, Andreas

Thanks,
Jesper

-- 
Jesper Dybdal
https://www.dybdal.dk