lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20220628042012.60675-1-bo.wu@vivo.com>
Date:   Tue, 28 Jun 2022 12:20:12 +0800
From:   Wu Bo <bo.wu@...o.com>
To:     zhangjiachen.jaycee@...edance.com, vgoyal@...hat.com
Cc:     miklos@...redi.hu, bo.wu@...o.com, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] fuse: force sync attr when inode is invalidated

Hi Vivek and Jiachen,

Sorry for late reply. It took time to build open email infrastructure in office.

> On Fri, Jun 24, 2022 at 10:27 AM Jiachen Zhang
> <zhangjiachen.jaycee@...edance.com> wrote:
>>
>> On Fri, Jun 24, 2022 at 3:26 AM Vivek Goyal <vgoyal@...hat.com> wrote:
>>>
>>> On Tue, Jun 21, 2022 at 08:56:51PM +0800, wubo wrote:
>>>> From: Wu Bo <bo.wu@...o.com>
>>>>
>>>> Now the fuse driver only trust it's local inode size when
>>>> writeback_cache is enabled. Even the userspace server tell the driver
>>>> the inode cache is invalidated, the size attrabute will not update. And
>>>> will keep it's out-of-date size till the inode cache is dropped. This is
>>>> not reasonable.
>>>
>>> BTW, can you give more details about what's the use case. With
>>> writeback_cache, writes can be cached in fuse and not sent to
>>> file server immediately. And I think that's why fuse trusts
>>> local i_size.
>>>

Let me introduce this use case. It's widely used in Android11 now.
Android11 use fuse for APP to access the public files in order to support the
dynamic permission. To increase the performance at the same time, Android11
allow APPs to write big size files(e.g. jpg or mp4) directly to the lower fs
bypass fuse. So this issue come out:

       APP-a                              APP-b
  Write a jpg file through lowfs
                                     Read the jpg file, size is 1M
  Fish write
  Update the size 2M in DB
  Invalidate the fuse entry
                                     Size keep to be 1M

             [ Drop the inode cache manually ]

	                             Size be update to 2M

If the inode cache keep in kernel, the size will not update forever because of
the writeback mode.

>>> With writeback_cache enabled, I don't think file should be modified
>>> externally (outside the fuse client).
>>>
>>> So what's that use case where file size cached in fuse is out of
>>> date. You probably should not use writeback_cache if you are
>>> modifying files outside the fuse client.

As we all know enable writeback mode can improve the performance. So this
feature can't be disabled in Android. And the key cause of this issue is that
writeback mode in fuse is not so reasonable now. And there are many methods to
fix this issue.

>>>
>>> Having said that I am not sure why FUSE_NOTIFY_INVAL_INODE was added to
>>> begin with. If files are not supposed to be modifed outside the fuse
>>> client, why are we dropping acls and invalidating attrs. If intent is
>>> just to drop page cache, then it should have been just that nothing
>>> else.
>>>

I proposal force attr invalidation. Because I think if user call the
FUSE_NOTIFY_INVAL_INODE as Android11 do it, they definately know what they are
doing. And the fuse driver should do the expected response.

>>> So up to some extent, FUSE_NOTIFY_INVAL_INODE is somewhat confusing. Would
>>> have been good if there was some documentation for it.
>>>
>>> Thanks
>>> Vivek
>>>
>>
>> Hi Wu and Vivek,
>>
>> Recently, we have had some discussions about the writeback_cache
>> revalidation on the mailing list [1][2]. Miklos gave his initial
>> patchset about writeback_cache v2, which supports c/mtime and size
>> updates [1]. However, those methods do not make use of reverse
>> messages, as virtio-fs does not support reverse notification yet. I'm
>> going to send out a new version of that patch based on the discussion
>> and with more considerations.
>
> The new patch:
> https://lore.kernel.org/linux-fsdevel/20220624055825.29183-1-zhangjiachen.jaycee@bytedance.com/
>
> Thanks,
> Jiachen
>
>>
>> I also agree that, semantically, FUSE_NOTIFY_INVAL_INODE should
>> invalidate i_size as well. So I think this patch is a good supplement
>> for FUSE_NOTIFY_INVAL_INODE. But we need to be more careful as the
>> size can be updated from server to kernel, and from kernel to server.
>> I will leave some comments about such issues in the following code.
>>
>> For the use case, writeback_cache is superb over write-through mode in
>> write-intensive scenarios, but its consistency among multiple clients
>> is too bad (almost no consistency). I think it's good to give a little
>> more consistency to writeback_cache.
>>
>> [1] https://lore.kernel.org/linux-fsdevel/20220325132126.61949-1-zhangjiachen.jaycee@bytedance.com/
>> [2] https://lore.kernel.org/linux-fsdevel/20220608104202.19461-1-zhangjiachen.jaycee@bytedance.com/

This is a good solution to this issue. But I think we don't need another
writeback_cache mode, it's time to do the cache consistency under
writeback_cache mode.

>>
>>>>
>>>> Signed-off-by: Wu Bo <bo.wu@...o.com>
>>>> ---
>>>>  fs/fuse/inode.c | 10 +++++++++-
>>>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
>>>> index 8c0665c5dff8..a4e62c7f2b83 100644
>>>> --- a/fs/fuse/inode.c
>>>> +++ b/fs/fuse/inode.c
>>>> @@ -162,6 +162,11 @@ static ino_t fuse_squash_ino(u64 ino64)
>>>>       return ino;
>>>>  }
>>>>
>>>> +static bool fuse_force_sync(struct fuse_inode *fi)
>>>> +{
>>>> +     return fi->i_time == 0;
>>>> +}
>>>> +
>>>>  void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
>>>>                                  u64 attr_valid, u32 cache_mask)
>>>>  {
>>>> @@ -222,8 +227,10 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
>>>>  u32 fuse_get_cache_mask(struct inode *inode)
>>>>  {
>>>>       struct fuse_conn *fc = get_fuse_conn(inode);
>>>> +     struct fuse_inode *fi = get_fuse_inode(inode);
>>>> +     bool is_force_sync = fuse_force_sync(fi);
>>>>
>>>> -     if (!fc->writeback_cache || !S_ISREG(inode->i_mode))
>>>> +     if (!fc->writeback_cache || !S_ISREG(inode->i_mode) || is_force_sync)
>>>>               return 0;
>>>>
>>>>       return STATX_MTIME | STATX_CTIME | STATX_SIZE;
>>>> @@ -437,6 +444,7 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
>>>>       fi = get_fuse_inode(inode);
>>>>       spin_lock(&fi->lock);
>>>>       fi->attr_version = atomic64_inc_return(&fc->attr_version);
>>>> +     fi->i_time = 0;
>>>>       spin_unlock(&fi->lock);
>>
>> Seems fuse_reverse_inval_inode() only drops page cache from offset to
>> offset+len, should we only invalidate i_time on a full cache drop?
>> Otherwise, as the server size is stale, the users may see a file is
>> truncated.
>>
>> Also, what if a FUSE_GETATTR request gets the attr_version after
>> fuse_reverse_inval_inode() increases it, but tries to update i_size
>> after the invalidate_inode_pages2_range() in
>> fuse_reverse_inval_inode()? In this case, server_size can be updated
>> by invalidate_inode_pages2_range(), and FUSE_GETATTR might gets a
>> stale server_size. Meanwhile, as FUSE_GETATTR has got the newest
>> attr_version, the kernel_size will still be updated. This can cause
>> false truncation even for a single FUSE client. So we may need to do
>> more about the attr_version in writeback mode.
>>
>> Thanks,
>> Jiachen
>>
>>>>
>>>>       fuse_invalidate_attr(inode);
>>>> --
>>>> 2.35.1
>>>>
>>>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ