[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALf2hKsMc3o+mYg2xwNEFO+q2Z=XteOmCjd1=EHOR0Na3=201Q@mail.gmail.com>
Date: Tue, 30 Dec 2025 15:37:53 +0800
From: Zhiyu Zhang <zhiyuzhang999@...il.com>
To: viro@...iv.linux.org.uk, brauner@...nel.org, Jan Kara <jack@...e.cz>,
hirofumi@...l.parknet.co.jp, linux-fsdevel@...r.kernel.org,
syzkaller <syzkaller@...glegroups.com>, linux-kernel@...r.kernel.org
Subject: Re: [Kernel Bug] WARNING in vfat_rmdir
Dear Linux kernel developers and maintainers,
I’m sorry — the root cause analysis in my previous report was likely
incorrect, and the patch I suggested there does not alleviate the
issue after further testing, which means that the root cause is not on
the errno passing.
After adding debug prints in fat__get_entry(), I observed frequent
cases of err=0, phys=0 at pos == i_size, which means the code is
taking a "normal EOF" path (as decided by fat_bmap()) rather than
hitting and swallowing a negative errno. As a result,
fat_get_short_entry() still returns -ENOENT, fat_dir_empty() still
returns 0, and the code path does not prevent drop_nlink(dir) from
being executed even when the parent directory's i_nlink is already
abnormal. In other words, the parent directory's i_nlink appears to be
wrong/corrupted in the first place. Subsequent vfat_rmdir() calls then
decrement the already-too-low link count, eventually reaching 0 and
triggering WARN_ON(inode->i_nlink == 0) in drop_nlink() (and panicking
if panic_on_warn is enabled).
So please IGNORE my previous patch proposal. A conservative mitigation
that I tested EFFECTIVE is similar to the UDF fix for corrupted parent
link count handling (Jan Kara's WARNING in udf_rmdir fix:
"https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c5566903af56dd1abb092f18dcb0c770d6cd8dcb").
static int vfat_rmdir(struct inode *dir, struct dentry *dentry)
err = fat_remove_entries(dir, &sinfo); /* and releases bh */
if (err)
goto out;
- drop_nlink(dir);
+ if (dir->i_nlink >= 3)
+ drop_nlink(dir);
+ else
+ fat_fs_error_ratelimit(sb, "parent dir link count too
low (%u)\n",
+ dir->i_nlink);
clear_nlink(inode);
fat_truncate_time(inode, NULL, S_ATIME|S_MTIME);
Given that the Syz reproducer (part) looks like:
syz_mount_image$vfat(&(0x7f0000000080), &(0x7f0000001240)='./bus\x00',
0xa00400, &(0x7f0000000000)=ANY=[@ANYRES64=0x0], 0x1, 0x1230,
&(0x7f00000024c0)="$XXX...")
r1 = openat(0xffffffffffffff9c, &(0x7f0000000040)='.\x00', 0x0, 0x0)
mkdirat(r1, &(0x7f0000000180)='./bus\x00', 0x0)
mkdirat(r1, &(0x7f0000000100)='./file0/file0\x00', 0x0) (rerun: 64)
unlinkat(r1, &(0x7f00000001c0)='./bus/file0\x00', 0x200) (rerun: 64)
I'm still debugging why, with the corrupted image, creation of
./file0/file0 seems to end up writing into bus's directory data blocks
(cluster sharing), so bus's on-disk directory content appears to
contain a subdir entry while bus->i_nlink remains 2.
Best,
Zhiyu Zhang
Zhiyu Zhang <zhiyuzhang999@...il.com> 于2025年12月30日周二 02:18写道:
>
> Dear Linux kernel developers and maintainers,
>
> We would like to report a filesystem corruption triggered bug that
> causes a WARNING in drop_nlink() from the VFAT rmdir path, and leads
> to a kernel panic when panic_on_warn is enabled. The bug titled
> "WARNING in vfat_rmdir" was found on linux-6.17.1 and is also
> reproducible on the latest 6.19-rc3.
>
> The possible root cause is that the FAT directory iteration helpers
> conflate real errors with "end of directory" in a way that hides
> corruption from higher layers. Concretely, fat__get_entry() returns -1
> both when it reaches EOF (!phys) and when fat_bmap() fails due to a
> corrupted cluster chain (err != 0). Then fat_get_short_entry() treats
> any < 0 from fat_get_entry() as "no more entries" and returns -ENOENT.
> As a result, callers such as fat_dir_empty() and fat_subdirs() cannot
> distinguish a genuinely empty directory from a directory walk that
> terminates early due to corruption. In this situation, fat_dir_empty()
> may incorrectly return success (empty), allowing vfat_rmdir() to
> proceed with metadata updates, including drop_nlink(dir). Separately,
> fat_subdirs() may silently "succeed" with an incorrect count (e.g., 0)
> when the walk is cut short, which can further poison in-memory link
> counts when inodes are built from corrupted on-disk state. Eventually,
> the VFAT rmdir path can reach drop_nlink() with an already-zero
> i_nlink, triggering WARN_ON(inode->i_nlink == 0) and panicking under
> panic_on_warn.
>
> This bug may lead to denial-of-service on systems that enable
> panic_on_warn, and more broadly to inconsistent in-memory metadata
> updates when operating on corrupted VFAT images.
>
> We suggest the following potential patch:
> (1) Propagate real errors from the directory iteration path instead of
> folding them into -ENOENT and make fat_get_short_entry() translate
> only true EOF into -ENOENT while propagating other negative errors.
> (2) Update fat_dir_empty() / fat_subdirs() to treat propagated errors
> as failures rather than "empty" / a weird count, and handle negative
> returns at their call sites.
>
> diff --git a/fs/fat/dir.c b/fs/fat/dir.c
> index 92b091783966..f4c5a6f0cc84 100644
> --- a/fs/fat/dir.c
> +++ b/fs/fat/dir.c
> @@ -92,8 +92,10 @@ static int fat__get_entry(struct inode *dir, loff_t *pos,
> *bh = NULL;
> iblock = *pos >> sb->s_blocksize_bits;
> err = fat_bmap(dir, iblock, &phys, &mapped_blocks, 0, false);
> - if (err || !phys)
> - return -1; /* beyond EOF or error */
> + if (err)
> + return err; /* real error (e.g., -EIO, -EUCLEAN) */
> + if (!phys)
> + return -1; /* beyond EOF */
>
> fat_dir_readahead(dir, iblock, phys);
>
> @@ -882,12 +884,14 @@ static int fat_get_short_entry(struct inode
> *dir, loff_t *pos,
> struct buffer_head **bh,
> struct msdos_dir_entry **de)
> {
> - while (fat_get_entry(dir, pos, bh, de) >= 0) {
> + int err;
> + while ((err = fat_get_entry(dir, pos, bh, de)) >= 0) {
> /* free entry or long name entry or volume label */
> if (!IS_FREE((*de)->name) && !((*de)->attr & ATTR_VOLUME))
> return 0;
> }
> - return -ENOENT;
> + /* -1 is EOF sentinel; propagate other errors */
> + return (err == -1) ? -ENOENT : err;
> }
>
> /*
> @@ -919,11 +923,11 @@ int fat_dir_empty(struct inode *dir)
> struct buffer_head *bh;
> struct msdos_dir_entry *de;
> loff_t cpos;
> - int result = 0;
> + int result = 0, err;
>
> bh = NULL;
> cpos = 0;
> - while (fat_get_short_entry(dir, &cpos, &bh, &de) >= 0) {
> + while ((err = fat_get_short_entry(dir, &cpos, &bh, &de)) >= 0) {
> if (strncmp(de->name, MSDOS_DOT , MSDOS_NAME) &&
> strncmp(de->name, MSDOS_DOTDOT, MSDOS_NAME)) {
> result = -ENOTEMPTY;
> @@ -931,6 +935,8 @@ int fat_dir_empty(struct inode *dir)
> }
> }
> brelse(bh);
> + if (err < 0 && err != -ENOENT)
> + return err;
> return result;
> }
> EXPORT_SYMBOL_GPL(fat_dir_empty);
> @@ -944,15 +950,17 @@ int fat_subdirs(struct inode *dir)
> struct buffer_head *bh;
> struct msdos_dir_entry *de;
> loff_t cpos;
> - int count = 0;
> + int count = 0, err;
>
> bh = NULL;
> cpos = 0;
> - while (fat_get_short_entry(dir, &cpos, &bh, &de) >= 0) {
> + while ((err = fat_get_short_entry(dir, &cpos, &bh, &de)) >= 0) {
> if (de->attr & ATTR_DIR)
> count++;
> }
> brelse(bh);
> + if (err < 0 && err != -ENOENT)
> + return err;
> return count;
> }
>
> diff --git a/fs/fat/inode.c b/fs/fat/inode.c
> index 0b6009cd1844..36ec8901253e 100644
> --- a/fs/fat/inode.c
> +++ b/fs/fat/inode.c
> @@ -535,7 +535,10 @@ int fat_fill_inode(struct inode *inode, struct
> msdos_dir_entry *de)
> return error;
> MSDOS_I(inode)->mmu_private = inode->i_size;
>
> - set_nlink(inode, fat_subdirs(inode));
> + int nsubs = fat_subdirs(inode);
> + if (nsubs < 0)
> + return nsubs;
> + set_nlink(inode, nsubs);
>
> error = fat_validate_dir(inode);
> if (error < 0)
> @@ -1345,7 +1348,10 @@ static int fat_read_root(struct inode *inode)
> fat_save_attrs(inode, ATTR_DIR);
> inode_set_mtime_to_ts(inode,
> inode_set_atime_to_ts(inode,
> inode_set_ctime(inode, 0, 0)));
> - set_nlink(inode, fat_subdirs(inode)+2);
> + int nsubs = fat_subdirs(inode);
> + if (nsubs < 0)
> + return nsubs;
> + set_nlink(inode, nsubs+2);
>
> return 0;
> }
>
> If the approach above is acceptable, we are willing to submit a proper
> patch immediately. Please let us know if any further information is
> required.
>
> Best regards,
> Zhiyu Zhang
Powered by blists - more mailing lists