Message-ID: <ZOwu2vrzX/0dX89/@dread.disaster.area>
Date:   Mon, 28 Aug 2023 15:21:30 +1000
From:   Dave Chinner <david@...morbit.com>
To:     cheng.lin130@....com.cn
Cc:     djwong@...nel.org, linux-xfs@...r.kernel.org,
        linux-kernel@...r.kernel.org, jiang.yong5@....com.cn,
        wang.liang82@....com.cn, liu.dong3@....com.cn
Subject: Re: [PATCH] xfs: introduce protection for drop nlink

On Mon, Aug 28, 2023 at 11:29:51AM +0800, cheng.lin130@....com.cn wrote:
> > On Sat, Aug 26, 2023 at 10:54:11PM +0800, cheng.lin130@....com.cn wrote:
> > > > > In the old kernel version, this situation was
> > > > > encountered, but I don't know how it happened. It was already a scene
> > > > > with directory errors: "Too many links".
> > How do you overflow the directory link count in XFS? You can't fit
> > 2^31 unique names in the directory data segment - the directory will
> > ENOSPC at 32GB of name data, and that typically occurs with at most
> > 300-500 million dirents (depending on name lengths) in the
> > directory.
> > IOWs, normal operation shouldn't be able to overflow the directory link
> > count at all, and so underruns shouldn't occur, either.
> Customer's explanation: not many subdirectories are created in the directory
> with the incorrect nlink, and normally it holds only 2 regular files.
> xfs_repair found only this one directory with an incorrect nlink.
>   systemd-fsck[5635]: Phase 2 - using internal log
>   systemd-fsck[5635]: - zero log...
>   systemd-fsck[5635]: - scan filesystem freespace and inode maps...
>   systemd-fsck[5635]: agi unlinked bucket 9 is 73 in ag 22 (inode=23622320201)

So the directory inode is on the unlinked list, as I suggested it
would be.

>   systemd-fsck[5635]: - 21:46:00: scanning filesystem freespace - 32 of 32 allocation groups done
>   systemd-fsck[5635]: - found root inode chunk
>   ...

How many other inodes were repaired or trashed or moved to
lost+found?

>   systemd-fsck[5635]: Phase 7 - verify and correct link counts...
>   systemd-fsck[5635]: resetting inode 23622320201 nlinks from 4294967284 to 2

The link count of the directory inode on the unlinked list was
actually -12, so this isn't an "off by one" error. It's still just 2
adjacent bits being cleared when they shouldn't have been, though.
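
As a quick check of that arithmetic, here is a minimal sketch (not
anything taken from xfs_repair; the helper name as_signed32 is made up
for illustration) that reinterprets the reported 32-bit value as a
signed integer and shows the bit pattern of its magnitude:

```python
def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as two's-complement signed."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the value is negative in two's complement.
    return value - (1 << 32) if value & 0x80000000 else value


reported = 4294967284          # nlinks value printed by xfs_repair above
signed = as_signed32(reported)

print(signed)                  # -12
print(bin(-signed))            # 0b1100: two adjacent bits set
```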

What is the xfs_info (or mkfs) output for the filesystem that this
occurred on?

.....

> If it's just an incorrect count on one directory, then after ignoring it the fs
> can work normally (with the error). Is it worth stopping the entire fs
> immediately for this condition?

The inode is on the unlinked list with a non-zero link count. That
means it cannot be removed from the unlinked list (because the inode
will not be freed during inactivation) and so the unlinked list is
effectively corrupt. Anything that removes an inode or creates an
O_TMPFILE or uses RENAME_WHITEOUT can trip over this corrupt
unlinked list and have things go from bad to worse. Hence the
corruption is not limited to the directory inode or operations
involving that directory inode. We generally shut down the
filesystem when this sort of corruption occurs - it needs to be
repaired ASAP, otherwise other stuff will randomly fail and you'll
still end up with a shut down filesystem. Better to fail fast in
corruption cases than try to ignore it and vainly hope that
everything will work out for the best....

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
