Message-ID: <4d8a1e66-a7a2-49a0-be34-2b918c73f092@linux.dev>
Date: Sun, 28 Sep 2025 16:35:42 +0800
From: Youling Tang <youling.tang@...ux.dev>
To: Qu Wenruo <quwenruo.btrfs@....com>
Cc: David Sterba <dsterba@...e.com>, Josef Bacik <josef@...icpanda.com>,
Chris Mason <clm@...com>, linux-btrfs@...r.kernel.org,
linux-kernel@...r.kernel.org, Youling Tang <tangyouling@...inos.cn>
Subject: Re: [PATCH] btrfs: Add the nlink annotation in btrfs_inode_item
On 9/28/25 15:37, Qu Wenruo wrote:
>
>
> On 2025/9/28 16:39, Youling Tang wrote:
>> On 9/28/25 13:16, Qu Wenruo wrote:
>>
>>>
>>>
>>> On 2025/9/28 11:44, Youling Tang wrote:
>>>> Hi, Wenruo
>>>>
>>>> On 9/26/25 16:34, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> On 2025/9/26 17:15, Youling Tang wrote:
>>>>>> From: Youling Tang <tangyouling@...inos.cn>
>>>>>>
>>>>>> When I created a directory, I found that its hard link count was
>>>>>> 1, unlike on other filesystems, where a new directory starts with
>>>>>> a count of 2 (its entry in the parent plus its own "." entry).
>>>>>>
>>>>>> The code shows that btrfs always keeps the nlink of a directory
>>>>>> at 1, which is a deliberate design choice.
>>>>>>
>>>>>> Adding a comment prevents this from being mistakenly regarded
>>>>>> as a bug.
>>>>>>
>>>>>> Signed-off-by: Youling Tang <tangyouling@...inos.cn>
>>>>>> ---
>>>>>> include/uapi/linux/btrfs_tree.h | 1 +
>>>>>> 1 file changed, 1 insertion(+)
>>>>>>
>>>>>> diff --git a/include/uapi/linux/btrfs_tree.h
>>>>>> b/include/uapi/linux/btrfs_tree.h
>>>>>> index fc29d273845d..b4f7da90fd0e 100644
>>>>>> --- a/include/uapi/linux/btrfs_tree.h
>>>>>> +++ b/include/uapi/linux/btrfs_tree.h
>>>>>> @@ -876,6 +876,7 @@ struct btrfs_inode_item {
>>>>>> __le64 size;
>>>>>> __le64 nbytes;
>>>>>> __le64 block_group;
>>>>>> + /* nlink in directories is fixed at 1 */
>>>>>
>>>>> nlink of what?
>>>>>
>>>>> Shouldn't it be "nlink of directories" or "nlink of directory inodes"?
>>>>>
>>>>>
>>>>> There are better locations for this, like
>>>>> btrfs-progs/Documentation/dev/On-disk-format.rst.
>>>>>
>>>>> And you're only adding one single comment for a single member?
>>>>> Even if this behavior differs from other filesystems, why not
>>>>> explain what the impact of the difference is?
>>>>>
>>>>>
>>>>> If you really want to add proper comments, spend more time and
>>>>> effort like commit 9c6b1c4de1c6 ("btrfs: document device locking")
>>>>> to do it correctly.
>>>>
>>>> My understanding of nlink is as follows; please correct me if I'm
>>>> wrong:
>>>>
>>>> /*
>>>> * nlink represents the hard link count (corresponds to the
>>>> * inode->i_nlink value).
>>>> * For directories, this value is always 1, which differs from other
>>>> * filesystems, where a newly created directory has an inode->i_nlink
>>>> * value of 2 (including the "." entry pointing to itself).
>>>
>>> Have you checked what the nlink number means on other filesystems,
>>> and why they behave that way?
>>>
>> I have examined ext4, XFS, and bcachefs. In these filesystems,
>> when performing the following operations:
>> ```
>> # mkdir -p a/b
>> # cd a/b
>> # ls -la
>> drwxr-xr-x 2 root root 6 Sep 28 14:45 .
>> drwxr-xr-x 3 root root 15 Sep 28 14:45 ..
>> ```
>>
>> In btrfs:
>> ```
>> # ls -la
>> drwxr-xr-x 1 root root 0 Sep 28 14:48 .
>> drwxr-xr-x 1 root root 2 Sep 28 14:48 ..
>> ```
>>
>> In filesystems like ext4, we can see that the link counts for
>> directories 'a' and 'b' are 3 and 2 respectively:
>> a: The directory itself + "." pointing to itself + ".." from
>> directory b pointing to it
>> b: The directory itself + "." pointing to itself
>>
>>
>> nlink changes during directory creation in ext4:
>> ```
>> ext4_mkdir
>> ext4_init_new_dir
>> set_nlink(inode, 2) // Initial inode->i_nlink value for new directory
>> ext4_inc_count(dir) // Increase parent directory's nlink by 1 (for "..")
>> ```
>>
>> In ext4, when the DIR_NLINK feature is enabled, if a directory's link
>> count exceeds EXT4_LINK_MAX, it will be permanently set to 1.
>>
>>
>> nlink changes during directory creation in bcachefs:
>> ```
>> bch2_mkdir
>> bch2_mknod
>> __bch2_create
>> bch2_create_trans
>> dir_u->bi_nlink++ // If creating a directory, increase parent's nlink
>> bch2_inode_update_after_write
>> set_nlink(&inode->v, bch2_inode_nlink_get(bi))
>> bch2_inode_nlink_get // If directory, nlink increased by 2
>> ```
>>
>>
>> In XFS, the xfs_create function contains the following comment:
>> /*
>> * A newly created regular or special file just has one directory
>> * entry pointing to them, but a directory also the "." entry
>> * pointing to itself.
>> */
>
> You didn't even understand what the nlink represents on these
> filesystems.
I understand that the nlink of a directory is 2 + the number of
subdirectories: one link for its entry in the parent, one for its own
".", and one for the ".." of each subdirectory.
The a/b directory example I mentioned earlier already reflects this.
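As a rough illustration of that accounting (my own sketch, not code from
any of these filesystems; the helper names are made up), the expected
count can be compared with what stat() reports. On btrfs the observed
value would be 1, as in the listing above:

```python
import os
import tempfile


def traditional_dir_nlink(num_subdirs: int) -> int:
    """Link count a traditional Unix filesystem reports for a directory."""
    # 1 for the entry in the parent, 1 for ".", plus one ".." per subdirectory.
    return 2 + num_subdirs


def observed_dir_nlink(path: str) -> int:
    """Link count the running filesystem actually reports for path."""
    return os.stat(path).st_nlink


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a")
        os.makedirs(os.path.join(a, "b"))  # same layout as `mkdir -p a/b`
        print("traditional nlink of a:", traditional_dir_nlink(1))
        print("observed nlink of a:   ", observed_dir_nlink(a))
```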
However, I was unaware that find relies on nlink >= 2 for its leaf
optimization.
Thank you for letting me know.
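To check my understanding of the optimization, here is a simplified
sketch of how a tree walker could use the count. MIN_DIR_NLINK mirrors
the constant in the gl/lib/fts.c excerpt quoted below; subdir_hint() and
list_subdirs() are hypothetical names, and as far as I know the real fts
code also decides per filesystem whether st_nlink can be trusted, which
this sketch ignores:

```python
import os

MIN_DIR_NLINK = 2  # minimum link count of a traditional Unix directory


def subdir_hint(st_nlink: int):
    """Return the subdirectory count implied by st_nlink, or None if unknown."""
    if st_nlink < MIN_DIR_NLINK:
        # e.g. btrfs reports 1, so the count carries no information
        return None
    return st_nlink - MIN_DIR_NLINK


def list_subdirs(path: str):
    """List subdirectories of path, skipping the scan for leaf directories."""
    if subdir_hint(os.stat(path).st_nlink) == 0:
        return []  # st_nlink == 2: trusted to mean "no subdirectories"
    with os.scandir(path) as entries:
        return sorted(e.name for e in entries
                      if e.is_dir(follow_symlinks=False))
```

With this logic, a directory on btrfs (st_nlink == 1) yields no hint, so
the walker falls back to scanning every entry.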
Thanks,
Youling.
>
> If you had bothered to check the code of find, it shows exactly what
> nlink means for a directory:
>
> gl/lib/fts.c:
>
> ```
> /* Minimum link count of a traditional Unix directory. When leaf
>    optimization is OK and a directory's st_nlink == MIN_DIR_NLINK,
>    then the directory has no subdirectories. */
> enum { MIN_DIR_NLINK = 2 };
>
> /* Whether leaf optimization is OK for a directory. */
> enum leaf_optimization
>   {
>     /* st_nlink is not reliable for this directory's subdirectories. */
>     NO_LEAF_OPTIMIZATION,
>
>     /* st_nlink == 2 means the directory lacks subdirectories. */
>     OK_LEAF_OPTIMIZATION
>   };
> ```
>
>
> Filesystems that return nlink >= 2 for a directory implement this
> optimization: the count indicates how many subdirectories it has.
>
> If you didn't even get this correct, all your words are just word
> salad, no better than AI slop.
>
>>
>> Thanks,
>> Youling.
>>
>>> Especially the impact on user space tools like find?
>>>
>>>> *
>>>> * BTRFS maintains parent-child relationships through explicit back
>>>> * references (BTRFS_INODE_REF_KEY items) rather than link count
>>>> * accounting.
>
> This has nothing to do with the nlink implementation of btrfs.
>
>>>> *
>>>> * This design simplifies metadata management in the copy-on-write
>>>> * environment and enables more reliable consistency checking.
>
> All these make no sense.
>
>>>> * Directory link count verification is performed during tree
>>>> * checking in check_inode_item(), where values greater than 1 are
>>>> * treated as corruption.
>>>> *
>>>> * For regular files, nlink behaves traditionally and represents the
>>>> * actual hard link count of the file.
>>>> */
>>>>
>>>> Thanks,
>>>> Youling.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>> __le32 nlink;
>>>>>> __le32 uid;
>>>>>> __le32 gid;
>>>>>
>>>
>>
>