Message-ID: <CAJxJ_jh=4q81OnSXk=yAU3u_7CCHZLGhb31eALF0cSyNv34E1g@mail.gmail.com>
Date: Mon, 14 Jul 2025 12:37:21 +0800
From: Jiany Wu <wujianyue000@...il.com>
To: "Theodore Ts'o" <tytso@....edu>
Cc: "Darrick J. Wong" <djwong@...nel.org>, yi.zhang@...wei.com, jack@...e.cz,
linux-ext4@...r.kernel.org
Subject: Re: Issue with ext4 filesystem corruption when writing to a file
after disk exhaustion
Hello, Ted,
Good day, and thanks for the clarification.
Yes, I previously mounted a dedicated ext4 disk image on /var/log
through the /dev/loop1 device, with rsyslogd writing to
/var/log/syslog. When /tmp is filled manually with fallocate, / also
reaches 100% usage; rsyslogd's writes to /dev/loop1 then fail, and the
filesystem is later remounted read-only. This differs from the earlier
scenario, and it is not easy to reproduce.
I updated the test case so that fallocate no longer takes all of the
disk space. It now allocates 95%, and everything works normally, with
no related errors printed. This confirms that the errors are caused by
disk exhaustion.
I think the main open question is whether fallocate should be allowed
to use the whole disk space.
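For illustration only, the adjusted test could be sketched roughly as
below. This is a minimal sketch, not the actual test case: the helper
names plan_allocation and fill_leaving_headroom are hypothetical, and
it assumes "95%" means keeping 5% of the filesystem's total size free.

```python
import os

def plan_allocation(total_bytes, avail_bytes, keep_free_percent=5):
    """Bytes a test may allocate while keeping keep_free_percent of the
    filesystem's total size free (hypothetical helper, not from the thread)."""
    reserve = total_bytes * keep_free_percent // 100
    return max(0, avail_bytes - reserve)

def fill_leaving_headroom(path, keep_free_percent=5):
    """Preallocate the file at `path`, leaving headroom on its filesystem.

    Returns the number of bytes allocated (0 if nothing could be allocated
    without eating into the reserve).
    """
    directory = os.path.dirname(os.path.abspath(path))
    st = os.statvfs(directory)
    size = plan_allocation(st.f_blocks * st.f_frsize,
                           st.f_bavail * st.f_frsize,
                           keep_free_percent)
    if size == 0:
        return 0  # posix_fallocate rejects a zero length, so skip the call
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)  # may raise OSError (e.g. ENOSPC)
    finally:
        os.close(fd)
    return size
```

For example, plan_allocation(1000, 400) returns 350, because 50 bytes
(5% of 1000) stay reserved.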
root@...tbed:~$ df -Th
Filesystem      Type      Size  Used Avail Use% Mounted on
udev            devtmpfs   16G     0   16G   0% /dev
tmpfs           tmpfs     3.2G   53M  3.1G   2% /run
root-overlay    overlay    32G  6.2G   25G  20% /
/dev/nvme0n1p3  ext4       32G  6.2G   25G  20% /host
/dev/loop1      ext4      3.9G  189M  3.5G   6% /var/log
tmpfs           tmpfs      16G  236M   16G   2% /dev/shm
tmpfs           tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs           tmpfs     4.0M     0  4.0M   0% /sys/fs/cgroup
root@...tbed:~$ mount | grep log
/host/disk-img/var-log.ext4 on /var/log type ext4 (rw,relatime)
root@...tbed:~$ ls -lh /host/disk-img/var-log.ext4
-rw-r--r-- 1 root root 4.0G Jul 14 07:05 /host/disk-img/var-log.ext4
root@...tbed:~$ file /host/disk-img/var-log.ext4
/host/disk-img/var-log.ext4: Linux rev 1.0 ext4 filesystem data,
UUID=49281462-eb22-4f19-8d03-51338eaf278a (needs journal recovery)
(extents) (64bit) (large files) (huge files)
# fallocate to exhaust /tmp directly
root@...tbed:~$ df /tmp
Filesystem     1K-blocks      Used Available Use% Mounted on
root-overlay   229572940 229556556         0 100% /
# loop write error
testbed ERR kernel: [ 1019.470013] I/O error, dev loop1, sector 266248
op 0x1:(WRITE) flags 0x103000 phys_seg 1 prio class 2
testbed ERR kernel: [ 1019.479242] Buffer I/O error on dev loop1,
logical block 33281, lost async page write
testbed ERR kernel: [ 1009.228833] loop: Write error at byte offset
673349632, length 4096.
testbed CRIT kernel: [ 1019.487101] EXT4-fs error (device loop1):
ext4_check_bdev_write_error:217: comm rs:main Q:Reg: Error while async
write back metadata
# remounting fs as read-only
testbed ERR kernel: [ 1326.758055] Aborting journal on device loop1-8.
testbed CRIT kernel: [ 1326.765336] EXT4-fs error (device loop1):
ext4_journal_check_start:83: comm auditd: Detected aborted journal
testbed CRIT kernel: [ 1326.765960] EXT4-fs error (device loop1):
ext4_journal_check_start:83: comm rs:main Q:Reg: Detected aborted
journal
testbed CRIT kernel: [ 1326.775629] EXT4-fs (loop1): Remounting
filesystem read-only
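The logs above show writes to loop1 failing while the filesystem on it
still reports free space, presumably because the storage holding the
image file has run out. A userspace check along the lines of the
min(fs space available, bdev space available) idea quoted below could
be sketched as follows. This is only an assumption-laden sketch: the
function names are hypothetical, and the result is a conservative
estimate, since blocks already allocated in the image file do not need
new space on the backing filesystem.

```python
import os

def avail_bytes(path):
    """Bytes available to unprivileged users on the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize

def effective_avail(fs_avail, backing_avail):
    """The smaller of the two figures: what the filesystem reports and what
    the storage underneath it can still provide."""
    return min(fs_avail, backing_avail)

def loop_fs_effective_avail(mount_point, backing_file):
    """Conservative free-space estimate for a loop-mounted image.

    mount_point:  where the image is mounted, e.g. /var/log
    backing_file: the image file, e.g. /host/disk-img/var-log.ext4
    """
    return effective_avail(avail_bytes(mount_point),
                           avail_bytes(backing_file))
```

If the backing filesystem reports 0 bytes available, the result is 0
regardless of what the mounted filesystem itself reports. The value can
change from one call to the next, which is the variability concern
raised in the quoted reply.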
Best regards,
Jianyue Wu
On Sat, Jul 12, 2025 at 10:34 PM Theodore Ts'o <tytso@....edu> wrote:
>
> On Fri, Jul 11, 2025 at 09:27:14PM -0700, Darrick J. Wong wrote:
> >
> > Honestly it's really too bad that there's no way for an fs to ask the
> > block device how much space it thinks is available, and then teach its
> > own statfs method to return min(fs space available, bdev space
> > available).
> >
> > Then at least df could report that your 500T ramdisk filesystem on a 4G
> > /tmp really only has 4G of space available.
>
> I think it would be better if there was an extra field in the statfs
> structure that reported bdev space available, and have it show up
> as an extra (optional) column in the df report.
>
> The problem is that bdev space available could be highly variable.
> For example, suppose you had a few thousand users all sharing thinly
> provisioned space. If a whole bunch of users suddenly all start using
> space, the available space at the storage layer could suddenly
> plummet. And if the available space starts getting low, this might trigger
> automated, central fstrims on all of the volumes, causing the free
> space to go back up.
>
> Having the free space on a file system as reported by df go up and
> down randomly would very likely cause users to get very confused
> and upset, especially when it wasn't under their control. Even for a
> single user system the free space in tmpfs could go down suddenly when
> some huge process suddenly started, and then go up suddenly when that
> process gets OOM-killed. :-)
>
> - Ted