lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 26 Oct 2017 11:07:51 +0800
From:   Yunlong Song <yunlong.song@...wei.com>
To:     Chao Yu <chao@...nel.org>, <jaegeuk@...nel.org>,
        <yuchao0@...wei.com>, <yunlong.song@...oud.com>
CC:     <miaoxie@...wei.com>, <bintian.wang@...wei.com>,
        <linux-fsdevel@...r.kernel.org>,
        <linux-f2fs-devel@...ts.sourceforge.net>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3] f2fs: add cur_reserved_blocks to support soft block
 reservation

Yes, I agree with the soft semantic you introduce, it is too slow to 
increase cur_reserved_blocks only in
dec_valid_block(,node)_count, e.g. if users want to set 
cur_reserved_blocks to 10G.

Then how about fix the initialization of cur_reserved_blocks in 
fs/f2fs/super.c as following:
sbi->current_reserved_blocks = 0
change to
sbi->current_reserved_blocks = min(sbi->reserved_blocks, 
sbi->user_block_count - valid_user_blocks(sbi));

On 2017/10/25 23:46, Chao Yu wrote:
> On 2017/10/25 22:06, Yunlong Song wrote:
>> Hi, Chao,
>>      Please see my comments below.
>>
>> On 2017/10/25 20:26, Chao Yu wrote:
>>> On 2017/10/25 18:02, Yunlong Song wrote:
>>>> ping...
>>> I've replied in this thread, check your email list please, or you can check the
>>> comments in below link:
>>>
>>> https://patchwork.kernel.org/patch/9909407/
>>>
>>> Anyway, see comments below.
>>>
>>>> On 2017/8/18 23:09, Yunlong Song wrote:
>>>>> This patch adds cur_reserved_blocks to extend reserved_blocks sysfs
>>>>> interface to be soft threshold, which allows user configure it exceeding
>>>>> current available user space. To ensure there is enough space for
>>>>> supporting system's activation, this patch does not set the reserved space
>>>>> to the configured reserved_blocks value at once, instead, it safely
>>>>> increase cur_reserved_blocks in dev_valid_block(,node)_count to only take
>>>>> up the blocks which are just obsoleted.
>>>>>
>>>>> Signed-off-by: Yunlong Song <yunlong.song@...wei.com>
>>>>> Signed-off-by: Chao Yu <yuchao0@...wei.com>
>>>>> ---
>>>>>     Documentation/ABI/testing/sysfs-fs-f2fs |  3 ++-
>>>>>     fs/f2fs/f2fs.h                          | 13 +++++++++++--
>>>>>     fs/f2fs/super.c                         |  3 ++-
>>>>>     fs/f2fs/sysfs.c                         | 15 +++++++++++++--
>>>>>     4 files changed, 28 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
>>>>> index 11b7f4e..ba282ca 100644
>>>>> --- a/Documentation/ABI/testing/sysfs-fs-f2fs
>>>>> +++ b/Documentation/ABI/testing/sysfs-fs-f2fs
>>>>> @@ -138,7 +138,8 @@ What:        /sys/fs/f2fs/<disk>/reserved_blocks
>>>>>     Date:        June 2017
>>>>>     Contact:    "Chao Yu" <yuchao0@...wei.com>
>>>>>     Description:
>>>>> -         Controls current reserved blocks in system.
>>>>> +         Controls current reserved blocks in system, the threshold
>>>>> +         is soft, it could exceed current available user space.
>>>>>       What:        /sys/fs/f2fs/<disk>/gc_urgent
>>>>>     Date:        August 2017
>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>>> index 2f20b6b..84ccbdc 100644
>>>>> --- a/fs/f2fs/f2fs.h
>>>>> +++ b/fs/f2fs/f2fs.h
>>>>> @@ -1041,6 +1041,7 @@ struct f2fs_sb_info {
>>>>>         block_t discard_blks;            /* discard command candidats */
>>>>>         block_t last_valid_block_count;        /* for recovery */
>>>>>         block_t reserved_blocks;        /* configurable reserved blocks */
>>>>> +    block_t cur_reserved_blocks;        /* current reserved blocks */
>>>>>           u32 s_next_generation;            /* for NFS support */
>>>>>     @@ -1515,7 +1516,8 @@ static inline int inc_valid_block_count(struct f2fs_sb_info *sbi,
>>>>>           spin_lock(&sbi->stat_lock);
>>>>>         sbi->total_valid_block_count += (block_t)(*count);
>>>>> -    avail_user_block_count = sbi->user_block_count - sbi->reserved_blocks;
>>>>> +    avail_user_block_count = sbi->user_block_count -
>>>>> +                        sbi->cur_reserved_blocks;
>>>>>         if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) {
>>>>>             diff = sbi->total_valid_block_count - avail_user_block_count;
>>>>>             *count -= diff;
>>>>> @@ -1549,6 +1551,10 @@ static inline void dec_valid_block_count(struct f2fs_sb_info *sbi,
>>>>>         f2fs_bug_on(sbi, sbi->total_valid_block_count < (block_t) count);
>>>>>         f2fs_bug_on(sbi, inode->i_blocks < sectors);
>>>>>         sbi->total_valid_block_count -= (block_t)count;
>>>>> +    if (sbi->reserved_blocks &&
>>>>> +        sbi->reserved_blocks != sbi->cur_reserved_blocks)
>>> It's redundent check here...
>> I think in most cases, cur_reserved_blocks is equal to reserved_blocks, so we do not need to calculate min any more, otherwise,
>> if reserved_blocks is not 0, it will calculate min and set current_reserved_blocks each time.
> OK, IMO, in some condition, we can save dirtying cache line to decrease cache
> line missing with that check.
>
>>>>> +        sbi->cur_reserved_blocks = min(sbi->reserved_blocks,
>>>>> +                    sbi->cur_reserved_blocks + count);
>>>>>         spin_unlock(&sbi->stat_lock);
>>>>>         f2fs_i_blocks_write(inode, count, false, true);
>>>>>     }
>>>>> @@ -1695,7 +1701,7 @@ static inline int inc_valid_node_count(struct f2fs_sb_info *sbi,
>>>>>         spin_lock(&sbi->stat_lock);
>>>>>           valid_block_count = sbi->total_valid_block_count + 1;
>>>>> -    if (unlikely(valid_block_count + sbi->reserved_blocks >
>>>>> +    if (unlikely(valid_block_count + sbi->cur_reserved_blocks >
>>>>>                             sbi->user_block_count)) {
>>>>>             spin_unlock(&sbi->stat_lock);
>>>>>             goto enospc;
>>>>> @@ -1738,6 +1744,9 @@ static inline void dec_valid_node_count(struct f2fs_sb_info *sbi,
>>>>>           sbi->total_valid_node_count--;
>>>>>         sbi->total_valid_block_count--;
>>>>> +    if (sbi->reserved_blocks &&
>>>>> +        sbi->reserved_blocks != sbi->cur_reserved_blocks)
>>> Checking low boundary is more safe here.
>> I think cur_reserved_blocks can never be larger than reserved_blocks in any case. so min(reserved_blocks,
>> cur_reserved_blocks +1) is same to cur_reserved_blocks++ when reserved_blocks != cur_reserved_blocks
>> (which means reserved_blocks > cur_reserved_block )
> Ditto.
>
>>>>> +        sbi->cur_reserved_blocks++;
>>>>>           spin_unlock(&sbi->stat_lock);
>>>>>     diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>> index 4c1bdcb..16a805f 100644
>>>>> --- a/fs/f2fs/super.c
>>>>> +++ b/fs/f2fs/super.c
>>>>> @@ -957,7 +957,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>>>>         buf->f_blocks = total_count - start_count;
>>>>>         buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
>>>>>         buf->f_bavail = user_block_count - valid_user_blocks(sbi) -
>>>>> -                        sbi->reserved_blocks;
>>>>> +                        sbi->cur_reserved_blocks;
>>>>>           avail_node_count = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
>>>>>     @@ -2411,6 +2411,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>>>                     le64_to_cpu(sbi->ckpt->valid_block_count);
>>>>>         sbi->last_valid_block_count = sbi->total_valid_block_count;
>>>>>         sbi->reserved_blocks = 0;
>>>>> +    sbi->cur_reserved_blocks = 0;
>>>>>           for (i = 0; i < NR_INODE_TYPE; i++) {
>>>>>             INIT_LIST_HEAD(&sbi->inode_list[i]);
>>>>> diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
>>>>> index a1be5ac..75c37bb 100644
>>>>> --- a/fs/f2fs/sysfs.c
>>>>> +++ b/fs/f2fs/sysfs.c
>>>>> @@ -104,12 +104,22 @@ static ssize_t features_show(struct f2fs_attr *a,
>>>>>         return len;
>>>>>     }
>>>>>     +static ssize_t f2fs_reserved_blocks_show(struct f2fs_attr *a,
>>>>> +        struct f2fs_sb_info *sbi, char *buf)
>>>>> +{
>>>>> +    return snprintf(buf, PAGE_SIZE, "expected: %u\ncurrent: %u\n",
>>>>> +            sbi->reserved_blocks, sbi->cur_reserved_blocks);
>>>>> +}
>>>>> +
>>>>>     static ssize_t f2fs_sbi_show(struct f2fs_attr *a,
>>>>>                 struct f2fs_sb_info *sbi, char *buf)
>>>>>     {
>>>>>         unsigned char *ptr = NULL;
>>>>>         unsigned int *ui;
>>>>>     +    if (a->struct_type == RESERVED_BLOCKS)
>>>>> +        return f2fs_reserved_blocks_show(a, sbi, buf);
>>>>> +
>>>>>         ptr = __struct_ptr(sbi, a->struct_type);
>>>>>         if (!ptr)
>>>>>             return -EINVAL;
>>>>> @@ -143,12 +153,13 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
>>>>>     #endif
>>>>>         if (a->struct_type == RESERVED_BLOCKS) {
>>>>>             spin_lock(&sbi->stat_lock);
>>>>> -        if ((unsigned long)sbi->total_valid_block_count + t >
>>>>> -                (unsigned long)sbi->user_block_count) {
>>>>> +        if (t > (unsigned long)sbi->user_block_count) {
>>>>>                 spin_unlock(&sbi->stat_lock);
>>>>>                 return -EINVAL;
>>>>>             }
>>>>>             *ui = t;
>>>>> +        if (t < (unsigned long)sbi->cur_reserved_blocks)
>>>>> +            sbi->cur_reserved_blocks = t;
>>> No, for 't < cur_reserved_blocks' case, cur_reserved_blocks will out of update
>>> even if there is enough free space. You know, for soft block resevation, we need
>>> to reserve blocks as many as possible, making free space being zero suddenly is
>>> possible.
>> I do not understand why it is not safe to decrease cur_reserved_blocks, for example,
>> if current cur_reserved_blocks is 100, now decrease it to 80, is there any problem?
>> If 80 will make free space zero, how does 100 exist?
>> And I do not think it is safe as following:
>>           *ui = t;
>> +        sbi->current_reserved_blocks = min(sbi->reserved_blocks,
>> +                sbi->user_block_count - valid_user_blocks(sbi));
>>
>> If user_block_count = 200, valid_user_blocks=150, reserved_blocks = 100,
>> then current_reserved_block = min(100,200-50) = 50, in this case, free space
>> is suddenly becoming zero.
> Free space becomes zero suddenly is safe, as I said before, I don't expect this
> feature can be used in android, instead, it may be used in distributed storage
> scenario, in where, once we configure soft_reserved_block making one server out
> of free space, it's not critical issue to this system since we can move current
> copy to another server which has enough free space.
>
> Secondly, as an global configuration, it's due to usage of administrator with
> it, if there is critical application which is sensitive with free space,
> administrator should make sure our reservation should not overload consuming free
> space, which means soft reservation is not suitable.
>
>> To avoid this, I change the code to:
>>
>> +        if (t < (unsigned long)sbi->cur_reserved_blocks)
>> +            sbi->cur_reserved_blocks = t;
>>
>> I think it is only safe to decrease the value of cur_reserved_blocks, and leave increase operation to
>> dec_valid_block(,node)_count, it is safe to increase cur_reserved_blocks there.
> For initialization of reserved_blocks, cur_reserved_blocks will always be zero
> due to this check, and will be updated to close to reserved_blocks after block
> allocation and deallocation of user, IMO, it's not looks reasonable to user.
>
> Anyway, it's due to how you define semantics of soft reservation, so what is your
> understanding of it?
>
> Thanks,
>
>>> Thanks,
>>>
>>>>>             spin_unlock(&sbi->stat_lock);
>>>>>             return count;
>>>>>         }
>>> .
>>>
> .
>

-- 
Thanks,
Yunlong Song


Powered by blists - more mailing lists