lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <04ef9339-940e-064a-f6cf-75671c956c04@kernel.org>
Date:   Wed, 25 Oct 2017 23:46:19 +0800
From:   Chao Yu <chao@...nel.org>
To:     Yunlong Song <yunlong.song@...wei.com>, jaegeuk@...nel.org,
        yuchao0@...wei.com, yunlong.song@...oud.com
Cc:     miaoxie@...wei.com, bintian.wang@...wei.com,
        linux-fsdevel@...r.kernel.org,
        linux-f2fs-devel@...ts.sourceforge.net,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] f2fs: add cur_reserved_blocks to support soft block
 reservation

On 2017/10/25 22:06, Yunlong Song wrote:
> Hi, Chao,
>     Please see my comments below.
> 
> On 2017/10/25 20:26, Chao Yu wrote:
>> On 2017/10/25 18:02, Yunlong Song wrote:
>>> ping...
>> I've replied in this thread, check your email list please, or you can check the
>> comments in below link:
>>
>> https://patchwork.kernel.org/patch/9909407/
>>
>> Anyway, see comments below.
>>
>>> On 2017/8/18 23:09, Yunlong Song wrote:
>>>> This patch adds cur_reserved_blocks to extend reserved_blocks sysfs
>>>> interface to be soft threshold, which allows user configure it exceeding
>>>> current available user space. To ensure there is enough space for
>>>> supporting system's activation, this patch does not set the reserved space
>>>> to the configured reserved_blocks value at once, instead, it safely
>>>> increase cur_reserved_blocks in dev_valid_block(,node)_count to only take
>>>> up the blocks which are just obsoleted.
>>>>
>>>> Signed-off-by: Yunlong Song <yunlong.song@...wei.com>
>>>> Signed-off-by: Chao Yu <yuchao0@...wei.com>
>>>> ---
>>>>    Documentation/ABI/testing/sysfs-fs-f2fs |  3 ++-
>>>>    fs/f2fs/f2fs.h                          | 13 +++++++++++--
>>>>    fs/f2fs/super.c                         |  3 ++-
>>>>    fs/f2fs/sysfs.c                         | 15 +++++++++++++--
>>>>    4 files changed, 28 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
>>>> index 11b7f4e..ba282ca 100644
>>>> --- a/Documentation/ABI/testing/sysfs-fs-f2fs
>>>> +++ b/Documentation/ABI/testing/sysfs-fs-f2fs
>>>> @@ -138,7 +138,8 @@ What:        /sys/fs/f2fs/<disk>/reserved_blocks
>>>>    Date:        June 2017
>>>>    Contact:    "Chao Yu" <yuchao0@...wei.com>
>>>>    Description:
>>>> -         Controls current reserved blocks in system.
>>>> +         Controls current reserved blocks in system, the threshold
>>>> +         is soft, it could exceed current available user space.
>>>>      What:        /sys/fs/f2fs/<disk>/gc_urgent
>>>>    Date:        August 2017
>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>> index 2f20b6b..84ccbdc 100644
>>>> --- a/fs/f2fs/f2fs.h
>>>> +++ b/fs/f2fs/f2fs.h
>>>> @@ -1041,6 +1041,7 @@ struct f2fs_sb_info {
>>>>        block_t discard_blks;            /* discard command candidats */
>>>>        block_t last_valid_block_count;        /* for recovery */
>>>>        block_t reserved_blocks;        /* configurable reserved blocks */
>>>> +    block_t cur_reserved_blocks;        /* current reserved blocks */
>>>>          u32 s_next_generation;            /* for NFS support */
>>>>    @@ -1515,7 +1516,8 @@ static inline int inc_valid_block_count(struct f2fs_sb_info *sbi,
>>>>          spin_lock(&sbi->stat_lock);
>>>>        sbi->total_valid_block_count += (block_t)(*count);
>>>> -    avail_user_block_count = sbi->user_block_count - sbi->reserved_blocks;
>>>> +    avail_user_block_count = sbi->user_block_count -
>>>> +                        sbi->cur_reserved_blocks;
>>>>        if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) {
>>>>            diff = sbi->total_valid_block_count - avail_user_block_count;
>>>>            *count -= diff;
>>>> @@ -1549,6 +1551,10 @@ static inline void dec_valid_block_count(struct f2fs_sb_info *sbi,
>>>>        f2fs_bug_on(sbi, sbi->total_valid_block_count < (block_t) count);
>>>>        f2fs_bug_on(sbi, inode->i_blocks < sectors);
>>>>        sbi->total_valid_block_count -= (block_t)count;
>>>> +    if (sbi->reserved_blocks &&
>>>> +        sbi->reserved_blocks != sbi->cur_reserved_blocks)
>> It's redundent check here...
> I think in most cases, cur_reserved_blocks is equal to reserved_blocks, so we do not need to calculate min any more, otherwise,
> if reserved_blocks is not 0, it will calculate min and set current_reserved_blocks each time.

OK, IMO, in some condition, we can save dirtying cache line to decrease cache
line missing with that check.

>>
>>>> +        sbi->cur_reserved_blocks = min(sbi->reserved_blocks,
>>>> +                    sbi->cur_reserved_blocks + count);
>>>>        spin_unlock(&sbi->stat_lock);
>>>>        f2fs_i_blocks_write(inode, count, false, true);
>>>>    }
>>>> @@ -1695,7 +1701,7 @@ static inline int inc_valid_node_count(struct f2fs_sb_info *sbi,
>>>>        spin_lock(&sbi->stat_lock);
>>>>          valid_block_count = sbi->total_valid_block_count + 1;
>>>> -    if (unlikely(valid_block_count + sbi->reserved_blocks >
>>>> +    if (unlikely(valid_block_count + sbi->cur_reserved_blocks >
>>>>                            sbi->user_block_count)) {
>>>>            spin_unlock(&sbi->stat_lock);
>>>>            goto enospc;
>>>> @@ -1738,6 +1744,9 @@ static inline void dec_valid_node_count(struct f2fs_sb_info *sbi,
>>>>          sbi->total_valid_node_count--;
>>>>        sbi->total_valid_block_count--;
>>>> +    if (sbi->reserved_blocks &&
>>>> +        sbi->reserved_blocks != sbi->cur_reserved_blocks)
>> Checking low boundary is more safe here.
> I think cur_reserved_blocks can never be larger than reserved_blocks in any case. so min(reserved_blocks,
> cur_reserved_blocks +1) is same to cur_reserved_blocks++ when reserved_blocks != cur_reserved_blocks
> (which means reserved_blocks > cur_reserved_block )

Ditto.

>>
>>>> +        sbi->cur_reserved_blocks++;
>>>>          spin_unlock(&sbi->stat_lock);
>>>>    diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>> index 4c1bdcb..16a805f 100644
>>>> --- a/fs/f2fs/super.c
>>>> +++ b/fs/f2fs/super.c
>>>> @@ -957,7 +957,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>>>        buf->f_blocks = total_count - start_count;
>>>>        buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
>>>>        buf->f_bavail = user_block_count - valid_user_blocks(sbi) -
>>>> -                        sbi->reserved_blocks;
>>>> +                        sbi->cur_reserved_blocks;
>>>>          avail_node_count = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
>>>>    @@ -2411,6 +2411,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>>                    le64_to_cpu(sbi->ckpt->valid_block_count);
>>>>        sbi->last_valid_block_count = sbi->total_valid_block_count;
>>>>        sbi->reserved_blocks = 0;
>>>> +    sbi->cur_reserved_blocks = 0;
>>>>          for (i = 0; i < NR_INODE_TYPE; i++) {
>>>>            INIT_LIST_HEAD(&sbi->inode_list[i]);
>>>> diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
>>>> index a1be5ac..75c37bb 100644
>>>> --- a/fs/f2fs/sysfs.c
>>>> +++ b/fs/f2fs/sysfs.c
>>>> @@ -104,12 +104,22 @@ static ssize_t features_show(struct f2fs_attr *a,
>>>>        return len;
>>>>    }
>>>>    +static ssize_t f2fs_reserved_blocks_show(struct f2fs_attr *a,
>>>> +        struct f2fs_sb_info *sbi, char *buf)
>>>> +{
>>>> +    return snprintf(buf, PAGE_SIZE, "expected: %u\ncurrent: %u\n",
>>>> +            sbi->reserved_blocks, sbi->cur_reserved_blocks);
>>>> +}
>>>> +
>>>>    static ssize_t f2fs_sbi_show(struct f2fs_attr *a,
>>>>                struct f2fs_sb_info *sbi, char *buf)
>>>>    {
>>>>        unsigned char *ptr = NULL;
>>>>        unsigned int *ui;
>>>>    +    if (a->struct_type == RESERVED_BLOCKS)
>>>> +        return f2fs_reserved_blocks_show(a, sbi, buf);
>>>> +
>>>>        ptr = __struct_ptr(sbi, a->struct_type);
>>>>        if (!ptr)
>>>>            return -EINVAL;
>>>> @@ -143,12 +153,13 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
>>>>    #endif
>>>>        if (a->struct_type == RESERVED_BLOCKS) {
>>>>            spin_lock(&sbi->stat_lock);
>>>> -        if ((unsigned long)sbi->total_valid_block_count + t >
>>>> -                (unsigned long)sbi->user_block_count) {
>>>> +        if (t > (unsigned long)sbi->user_block_count) {
>>>>                spin_unlock(&sbi->stat_lock);
>>>>                return -EINVAL;
>>>>            }
>>>>            *ui = t;
>>>> +        if (t < (unsigned long)sbi->cur_reserved_blocks)
>>>> +            sbi->cur_reserved_blocks = t;
>> No, for 't < cur_reserved_blocks' case, cur_reserved_blocks will out of update
>> even if there is enough free space. You know, for soft block resevation, we need
>> to reserve blocks as many as possible, making free space being zero suddenly is
>> possible.
> I do not understand why it is not safe to decrease cur_reserved_blocks, for example,
> if current cur_reserved_blocks is 100, now decrease it to 80, is there any problem?
> If 80 will make free space zero, how does 100 exist?
> And I do not think it is safe as following:
>          *ui = t;
> +        sbi->current_reserved_blocks = min(sbi->reserved_blocks,
> +                sbi->user_block_count - valid_user_blocks(sbi));
> 
> If user_block_count = 200, valid_user_blocks=150, reserved_blocks = 100,
> then current_reserved_block = min(100,200-50) = 50, in this case, free space
> is suddenly becoming zero.
Free space becomes zero suddenly is safe, as I said before, I don't expect this
feature can be used in android, instead, it may be used in distributed storage
scenario, in where, once we configure soft_reserved_block making one server out
of free space, it's not critical issue to this system since we can move current
copy to another server which has enough free space.

Secondly, as an global configuration, it's due to usage of administrator with
it, if there is critical application which is sensitive with free space,
administrator should make sure our reservation should not overload consuming free
space, which means soft reservation is not suitable.

> To avoid this, I change the code to:
> 
> +        if (t < (unsigned long)sbi->cur_reserved_blocks)
> +            sbi->cur_reserved_blocks = t;
> 
> I think it is only safe to decrease the value of cur_reserved_blocks, and leave increase operation to
> dec_valid_block(,node)_count, it is safe to increase cur_reserved_blocks there.

For initialization of reserved_blocks, cur_reserved_blocks will always be zero
due to this check, and will be updated to close to reserved_blocks after block
allocation and deallocation of user, IMO, it's not looks reasonable to user.

Anyway, it's due to how you define semantics of soft reservation, so what is your
understanding of it?

Thanks,

>>
>> Thanks,
>>
>>>>            spin_unlock(&sbi->stat_lock);
>>>>            return count;
>>>>        }
>> .
>>
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ