Date: Tue, 4 Jun 2024 15:09:01 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: "Darrick J. Wong" <djwong@...nel.org>,
 Christoph Hellwig <hch@...radead.org>
Cc: linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
 linux-kernel@...r.kernel.org, brauner@...nel.org, david@...morbit.com,
 chandanbabu@...nel.org, jack@...e.cz, willy@...radead.org,
 yi.zhang@...wei.com, chengzhihao1@...wei.com, yukuai3@...wei.com
Subject: Re: [RFC PATCH v4 8/8] xfs: improve truncate on a realtime inode with
 huge extsize

On 2024/5/31 23:00, Darrick J. Wong wrote:
> On Fri, May 31, 2024 at 07:15:34AM -0700, Christoph Hellwig wrote:
>> On Fri, May 31, 2024 at 07:12:10AM -0700, Darrick J. Wong wrote:
>>> There are <cough> some users that want 1G extents.
>>>
>>> For the rest of us who don't live in the stratosphere, it's convenient
>>> for fsdax to have rt extents that match the PMD size, which could be
>>> large on arm64 (e.g. 512M, or two SMR sectors).
>>
>> That's fine.  Maybe to rephrase my question.  With this series we
>> have 3 different truncate paths:
>>
>>  1) unmap all blocks (!rt || rtextsize == 1)
>>  2) zero leftover blocks in an rtextent (small rtextsize, but > 1)
>>  3) convert leftover blocks in an rtextent to unwritten (large
>>    rtextsize)
>>
>> What is the right threshold to switch between 2 and 3?  And do we
>> really need 2) at all?
> 
> I don't think we need (2) at all.
> 
> There's likely some threshold below which it's a wash -- compare with
> ext4's strategy of trying to write 64k chunks even if that requires
> zeroing pagecache, to cut down on fragmentation on HDDs -- but I don't
> know if we care anymore. ;)
> 

I supplemented some tests for small (but > 1 block) rtextsizes on my
ramdisk. Each file is written as a single rt extent (bs=$rtextsize
count=1) and then truncated down to 4k, so every truncate leaves a
partial tail extent:

  mkfs.xfs -f -m reflink=0,rmapbt=0 -d rtinherit=1 \
           -r rtdev=/dev/pmem1s,extsize=$rtextsize /dev/pmem2s
  mount -o rtdev=/dev/pmem1s /dev/pmem2s /mnt/scratch
  for i in {1..1000}; \
  do dd if=/dev/zero of=/mnt/scratch/$i bs=$rtextsize count=1; done
  sync
  time for i in {1..1000}; \
  do xfs_io -c "truncate 4k" /mnt/scratch/$i; done
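
The rt configuration can be double-checked before each run, e.g.
(a quick sanity check, not part of the timed runs):

  # confirm the realtime section's extent size
  xfs_info /mnt/scratch | grep realtime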

rtextsize            8k      16k      32k      64k     256k     1024k
zero out:          9.601s  10.229s  11.153s  12.086s  12.259s  20.141s
convert unwritten: 9.710s   9.642s   9.958s   9.441s  10.021s  10.526s
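
To confirm which strategy actually ran, the state of the tail extent
can be inspected after a truncate, e.g. (an illustrative check against
one of the test files, not something the timed loop does):

  # FIEMAP_EXTENT_UNWRITTEN shows up as 0x800 in the FLAGS column
  xfs_io -c "fiemap -v" /mnt/scratch/1

With (3) the leftover part of the last rt extent should carry the
unwritten flag, while with (2) it remains a written extent containing
zeroes.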

The tests show that there is not much difference between (2) and (3)
with small rtextsizes, but as the extent size grows, (3) wins clearly
(20.141s vs 10.526s at 1024k), so I agree with you that we could just
drop (2) for the rt device.

Thanks,
Yi.

