linux-ext4 - Re: Issue with ext4 filesystem corruption when writing to a file after disk exhaustion

Message-ID: <20250711052905.GC2026761@mit.edu>
Date: Fri, 11 Jul 2025 01:29:05 -0400
From: "Theodore Ts'o" <tytso@....edu>
To: Jiany Wu <wujianyue000@...il.com>
Cc: yi.zhang@...wei.com, jack@...e.cz, linux-ext4@...r.kernel.org
Subject: Re: Issue with ext4 filesystem corruption when writing to a file
 after disk exhaustion

On Fri, Jul 11, 2025 at 11:20:32AM +0800, Jiany Wu wrote:
> Hello,
> 
> Recently I encountered an issue in kernel 6.1.123, when writing to a
> file after disk exhaustion, it will report EFSCORRUPTED. I think it is
> un-expected behavior.

What you did was create a file system in /tmp/mydisk by creating a
sparse image file:

> root@...tbed:/tmp# touch mydisk
> root@...tbed:/tmp# ls -l mydisk
> -rw-r--r-- 1 root root 0 Jul  8 05:36 mydisk
> root@...tbed:/tmp# truncate -s 128M mydisk
> root@...tbed:/tmp# mkfs.ext4 mydisk

The potential problem is that this assumes /tmp had enough free space
to hold 128M of data.  But it's clear that it didn't.  So not only did
you exhaust the space in the file system, you *also* exhausted space
in /tmp.  You can see this because of the I/O errors when writing to
/dev/loop2:

> root@...tbed:/tmp# mount mydisk /mnt/test_fs/
> root@...tbed:/tmp# findmnt /mnt/test_fs
> TARGET       SOURCE     FSTYPE OPTIONS
> /mnt/test_fs /dev/loop2 ext4   rw,relatime
> ...
> root@...tbed:/mnt/test_fs# fallocate -l 32716560K /mnt/test_fs/test_file
> fallocate: fallocate failed: No space left on device
> root@...tbed:/mnt/test_fs# journalctl -f
> Jul 08 05:43:07 testbed kernel: loop: Write error at byte offset
> 9178112, length 1024.
> Jul 08 05:43:07 testbed kernel: loop: Write error at byte offset
> 274432, length 1024.

These error messages are write errors in /dev/loop2, which were almost
certainly caused by ENOSPC errors when trying to write to /tmp/mydisk.
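
A quick way to see the mismatch is to compare the size a sparse file
claims to have with the space actually allocated to it.  The following
is only a minimal sketch, assuming GNU coreutils (truncate, stat, df);
the scratch file comes from mktemp and is not the /tmp/mydisk from the
report:

```shell
#!/bin/sh
set -eu

# Hypothetical scratch file standing in for the image file.
img=$(mktemp)

# truncate only sets the file length; it does not reserve any blocks.
truncate -s 128M "$img"

# %s = apparent size in bytes, %b = blocks allocated, %B = bytes per block
apparent=$(stat -c %s "$img")
allocated=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))

echo "apparent size:  $apparent bytes"
echo "allocated size: $allocated bytes"

# Free space on the file system that has to back the image.
df -h "$(dirname "$img")"

rm -f "$img"
```

If the allocated size is far below the apparent size, every later
write through the loop device still has to find free space in the
backing file system, and it can fail there.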

This is the moral equivalent of buying a fraudulent USB thumb drive
from the back alleys of Shenzhen, where the USB thumb drive was
*labelled* as having 128MB of storage, but which only had 16MB of
flash, such that writes after the first 16MB would fail (or overwrite
other disk blocks).

If /tmp had enough space, then you wouldn't have seen these errors.

One alternative way you could create the image would have been to replace 

> root@...tbed:/tmp# touch mydisk
> root@...tbed:/tmp# ls -l mydisk
> -rw-r--r-- 1 root root 0 Jul  8 05:36 mydisk
> root@...tbed:/tmp# truncate -s 128M mydisk

with:

root@...tbed:/tmp# dd if=/dev/zero of=mydisk bs=1M count=128

This allocates 128MB to /tmp/mydisk, and if there isn't enough space
in /tmp, the dd will fail with an error.  If it succeeds, then when
you create the file system and mount it, you won't see the error
messages writing to /dev/loopN.
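
The same idea can be wrapped so that the setup stops before mkfs if the
space is not there.  This is a sketch under assumptions: make_image is
a hypothetical helper name, the 8M size is only to keep the example
small, and status=none assumes GNU dd:

```shell
#!/bin/sh

# make_image PATH SIZE_MB
# Hypothetical helper: write SIZE_MB megabytes of zeros to PATH and
# return non-zero (removing the partial file) if dd fails, e.g. with
# "No space left on device".
make_image() {
    path=$1
    size_mb=$2
    # status=none hides the transfer statistics but still shows errors.
    if ! dd if=/dev/zero of="$path" bs=1M count="$size_mb" status=none; then
        echo "cannot allocate ${size_mb}M for $path" >&2
        rm -f "$path"
        return 1
    fi
}

img=$(mktemp)
if make_image "$img" 8; then
    status=ok
    size=$(stat -c %s "$img")
    echo "image allocated: $size bytes"
    # mkfs.ext4 and mount would follow here.
else
    status=failed
fi

rm -f "$img"
```

Failing at this step is much easier to diagnose than write errors on
the loop device later.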

The bottom line is that the bug is a PEBCAK ("problem exists between
chair and keyboard"), which is another way of saying that it's a
failure of the system administrator to understand that they had done
something bad.  It is not a kernel bug, but rather a bug in your
procedure / system setup.

Cheers,

						- Ted