[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9630a73c-2f68-4284-97ad-f1e34a2abbdf@huaweicloud.com>
Date: Tue, 17 Dec 2024 15:05:41 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca,
yi.zhang@...wei.com, chengzhihao1@...wei.com, yukuai3@...wei.com,
yangerkun@...wei.com
Subject: Re: [PATCH v4 01/10] ext4: remove writable userspace mappings before
truncating page cache
On 2024/12/16 23:00, Jan Kara wrote:
> On Mon 16-12-24 09:39:06, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@...wei.com>
>>
>> When zeroing a range of folios on the filesystem which block size is
>> less than the page size, the file's mapped blocks within one page will
>> be marked as unwritten, we should remove writable userspace mappings to
>> ensure that ext4_page_mkwrite() can be called during subsequent write
>> access to these partial folios. Otherwise, data written by subsequent
>> mmap writes may not be saved to disk.
>>
>> $mkfs.ext4 -b 1024 /dev/vdb
>> $mount /dev/vdb /mnt
>> $xfs_io -t -f -c "pwrite -S 0x58 0 4096" -c "mmap -rw 0 4096" \
>> -c "mwrite -S 0x5a 2048 2048" -c "fzero 2048 2048" \
>> -c "mwrite -S 0x59 2048 2048" -c "close" /mnt/foo
>>
>> $od -Ax -t x1z /mnt/foo
>> 000000 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
>> *
>> 000800 59 59 59 59 59 59 59 59 59 59 59 59 59 59 59 59
>> *
>> 001000
>>
>> $umount /mnt && mount /dev/vdb /mnt
>> $od -Ax -t x1z /mnt/foo
>> 000000 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
>> *
>> 000800 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> *
>> 001000
>>
>> Fix this by introducing ext4_truncate_page_cache_block_range() to remove
>> writable userspace mappings when truncating a partial folio range.
>> Additionally, move the journal data mode-specific handlers and
>> truncate_pagecache_range() into this function, allowing it to serve as a
>> common helper that correctly manages the page cache in preparation for
>> block range manipulations.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
>
> I like the patch. Feel free to add:
>
> Reviewed-by: Jan Kara <jack@...e.cz>
>
> Just one thing occured to me when thinking about this: It seems like a
> nasty catch that truncate_inode_pages_range() does not writeprotect these
> partial pages because practically any filesystem supporting blocksize <
> pagesize and doing anything non-trivial in ->page_mkwrite handler will need
> this. So ultimately I think we might want to fix this in generic code but
> ext4 solution is fine for now.
>
Yes, I agree with you. I've checked XFS, Btrfs, and Bcachefs. Currently,
XFS and Btrfs choose to perform writeback before truncating partial
folios, so they do not require this at the moment. However, it appears
that Bcachefs also does a similar approach in __bch2_truncate_folio().
So I think do this in generic code should be better too.
Thanks,
Yi.
Powered by blists - more mailing lists