[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <30419139-6fdd-48df-b32a-9db7575cebb8@kernel.org>
Date: Tue, 26 Mar 2024 11:39:33 +0800
From: Chao Yu <chao@...nel.org>
To: Daeho Jeong <daeho43@...il.com>
Cc: linux-kernel@...r.kernel.org, linux-f2fs-devel@...ts.sourceforge.net,
kernel-team@...roid.com, Daeho Jeong <daehojeong@...gle.com>
Subject: Re: [f2fs-dev] [PATCH v3] f2fs: prevent writing without fallocate()
for pinned files
On 2024/3/25 23:02, Daeho Jeong wrote:
> On Fri, Mar 22, 2024 at 9:26 PM Chao Yu <chao@...nel.org> wrote:
>>
>> On 2024/3/21 1:42, Daeho Jeong wrote:
>>> On Wed, Mar 20, 2024 at 2:38 AM Chao Yu <chao@...nel.org> wrote:
>>>>
>>>> On 2024/3/20 5:23, Daeho Jeong wrote:
>>>>> From: Daeho Jeong <daehojeong@...gle.com>
>>>>>
>>>>> In a case writing without fallocate(), we can't guarantee it's allocated
>>>>> in the conventional area for zoned stroage.
>>>>>
>>>>> Signed-off-by: Daeho Jeong <daehojeong@...gle.com>
>>>>> ---
>>>>> v2: covered the direct io case
>>>>> v3: covered the mkwrite case
>>>>> ---
>>>>> fs/f2fs/data.c | 14 ++++++++++++--
>>>>> fs/f2fs/file.c | 16 ++++++++--------
>>>>> 2 files changed, 20 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>>> index c21b92f18463..d3e5ab2736a6 100644
>>>>> --- a/fs/f2fs/data.c
>>>>> +++ b/fs/f2fs/data.c
>>>>> @@ -1584,8 +1584,11 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, int flag)
>>>>>
>>>>> /* use out-place-update for direct IO under LFS mode */
>>>>> if (map->m_may_create &&
>>>>> - (is_hole || (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO))) {
>>>>> - if (unlikely(f2fs_cp_error(sbi))) {
>>>>> + (is_hole || (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO &&
>>>>> + !f2fs_is_pinned_file(inode)))) {
>>>>> + if (unlikely(f2fs_cp_error(sbi)) ||
>>>>> + (f2fs_is_pinned_file(inode) && is_hole &&
>>>>> + flag != F2FS_GET_BLOCK_PRE_DIO)) {
>>>>> err = -EIO;
>>>>> goto sync_out;
>>>>> }
>>>>> @@ -3378,6 +3381,8 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi,
>>>>> f2fs_map_lock(sbi, flag);
>>>>> locked = true;
>>>>> } else if ((pos & PAGE_MASK) >= i_size_read(inode)) {
>>>>> + if (f2fs_is_pinned_file(inode))
>>>>> + return -EIO;
>>>>> f2fs_map_lock(sbi, flag);
>>>>> locked = true;
>>>>> }
>>>>> @@ -3407,6 +3412,11 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi,
>>>>>
>>>>> if (!f2fs_lookup_read_extent_cache_block(inode, index,
>>>>> &dn.data_blkaddr)) {
>>>>> + if (f2fs_is_pinned_file(inode)) {
>>>>> + err = -EIO;
>>>>> + goto out;
>>>>> + }
>>>>> +
>>>>> if (locked) {
>>>>> err = f2fs_reserve_block(&dn, index);
>>>>> goto out;
>>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>>>> index 82277e95c88f..4db3b21c804b 100644
>>>>> --- a/fs/f2fs/file.c
>>>>> +++ b/fs/f2fs/file.c
>>>>> @@ -57,7 +57,7 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault *vmf)
>>>>> struct inode *inode = file_inode(vmf->vma->vm_file);
>>>>> struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
>>>>> struct dnode_of_data dn;
>>>>> - bool need_alloc = true;
>>>>> + bool need_alloc = !f2fs_is_pinned_file(inode);
>>>>
>>>> Will this check races w/ pinfile get|set?
>>>
>>> Do you mean "set/clear" case? I believe "set" case is okay, since we
>>
>> Yup,
>>
>>> can't set if the inode already has a data block. For "clear" case, I
>>
>> However, we can set pinfile on written inode in regular block device:
>
> You're right. I missed it. Maybe I think we should keep the concept
> consistent across devices regardless of zoned storage support. How
> about preventing file pinning for already written inodes across all
> device types? I am changing the pinfile concept by allowing the users
I'm okay with that, or we can tries to migrate data block of target file
from seq-zone to conv-zone or regular-device before setting it w/ pin flag...
Thanks,
> to write on only fallocate()-ed space.
>
>>
>> if (f2fs_sb_has_blkzoned(sbi) && F2FS_HAS_BLOCKS(inode)) {
>> ret = -EFBIG;
>> goto out;
>> }
>>
>> Should we add the logic only if blkzoned feture is enabled?
>>
>>> believe mkwrite failure is okay in racy conditions caused by clearing
>>> the pin flag. What do you think?
>>
>> Or we can use filemap_invalidate_lock() in f2fs_ioc_set_pin_file() to
>> avoid the race condition?
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>> int err = 0;
>>>>> vm_fault_t ret;
>>>>>
>>>>> @@ -114,19 +114,15 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault *vmf)
>>>>> goto out_sem;
>>>>> }
>>>>>
>>>>> + set_new_dnode(&dn, inode, NULL, NULL, 0);
>>>>> if (need_alloc) {
>>>>> /* block allocation */
>>>>> - set_new_dnode(&dn, inode, NULL, NULL, 0);
>>>>> err = f2fs_get_block_locked(&dn, page->index);
>>>>> - }
>>>>> -
>>>>> -#ifdef CONFIG_F2FS_FS_COMPRESSION
>>>>> - if (!need_alloc) {
>>>>> - set_new_dnode(&dn, inode, NULL, NULL, 0);
>>>>> + } else {
>>>>> err = f2fs_get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
>>>>> f2fs_put_dnode(&dn);
>>>>> }
>>>>> -#endif
>>>>> +
>>>>> if (err) {
>>>>> unlock_page(page);
>>>>> goto out_sem;
>>>>> @@ -4611,6 +4607,10 @@ static int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *iter,
>>>>> return ret;
>>>>> }
>>>>>
>>>>> + /* For pinned files, it should be fallocate()-ed in advance. */
>>>>> + if (f2fs_is_pinned_file(inode))
>>>>> + return 0;
>>>>> +
>>>>> /* Do not preallocate blocks that will be written partially in 4KB. */
>>>>> map.m_lblk = F2FS_BLK_ALIGN(pos);
>>>>> map.m_len = F2FS_BYTES_TO_BLK(pos + count);
Powered by blists - more mailing lists