lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9630a73c-2f68-4284-97ad-f1e34a2abbdf@huaweicloud.com>
Date: Tue, 17 Dec 2024 15:05:41 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
 linux-kernel@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca,
 yi.zhang@...wei.com, chengzhihao1@...wei.com, yukuai3@...wei.com,
 yangerkun@...wei.com
Subject: Re: [PATCH v4 01/10] ext4: remove writable userspace mappings before
 truncating page cache

On 2024/12/16 23:00, Jan Kara wrote:
> On Mon 16-12-24 09:39:06, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@...wei.com>
>>
>> When zeroing a range of folios on the filesystem which block size is
>> less than the page size, the file's mapped blocks within one page will
>> be marked as unwritten, we should remove writable userspace mappings to
>> ensure that ext4_page_mkwrite() can be called during subsequent write
>> access to these partial folios. Otherwise, data written by subsequent
>> mmap writes may not be saved to disk.
>>
>>  $mkfs.ext4 -b 1024 /dev/vdb
>>  $mount /dev/vdb /mnt
>>  $xfs_io -t -f -c "pwrite -S 0x58 0 4096" -c "mmap -rw 0 4096" \
>>                -c "mwrite -S 0x5a 2048 2048" -c "fzero 2048 2048" \
>>                -c "mwrite -S 0x59 2048 2048" -c "close" /mnt/foo
>>
>>  $od -Ax -t x1z /mnt/foo
>>  000000 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
>>  *
>>  000800 59 59 59 59 59 59 59 59 59 59 59 59 59 59 59 59
>>  *
>>  001000
>>
>>  $umount /mnt && mount /dev/vdb /mnt
>>  $od -Ax -t x1z /mnt/foo
>>  000000 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
>>  *
>>  000800 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>  *
>>  001000
>>
>> Fix this by introducing ext4_truncate_page_cache_block_range() to remove
>> writable userspace mappings when truncating a partial folio range.
>> Additionally, move the journal data mode-specific handlers and
>> truncate_pagecache_range() into this function, allowing it to serve as a
>> common helper that correctly manages the page cache in preparation for
>> block range manipulations.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
> 
> I like the patch. Feel free to add:
> 
> Reviewed-by: Jan Kara <jack@...e.cz>
> 
> Just one thing occured to me when thinking about this: It seems like a
> nasty catch that truncate_inode_pages_range() does not writeprotect these
> partial pages because practically any filesystem supporting blocksize <
> pagesize and doing anything non-trivial in ->page_mkwrite handler will need
> this. So ultimately I think we might want to fix this in generic code but
> ext4 solution is fine for now.
> 

Yes, I agree with you. I've checked XFS, Btrfs, and Bcachefs. Currently,
XFS and Btrfs choose to perform writeback before truncating partial
folios, so they do not require this at the moment. However, it appears
that Bcachefs also does a similar approach in __bch2_truncate_folio().
So I think do this in generic code should be better too.

Thanks,
Yi.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ