lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 24 Nov 2018 17:36:53 +0800
From:   Chao Yu <yuchao0@...wei.com>
To:     Sahitya Tummala <stummala@...eaurora.org>
CC:     Jaegeuk Kim <jaegeuk@...nel.org>,
        <linux-f2fs-devel@...ts.sourceforge.net>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] f2fs: fix sbi->extent_list corruption issue

On 2018/11/23 18:19, Sahitya Tummala wrote:
> On Fri, Nov 23, 2018 at 05:52:16PM +0800, Chao Yu wrote:
>> On 2018/11/23 11:42, Sahitya Tummala wrote:
>>> On Thu, Nov 22, 2018 at 04:11:07AM -0800, Jaegeuk Kim wrote:
>>>> On 11/22, Chao Yu wrote:
>>>>> On 2018/11/22 18:59, Sahitya Tummala wrote:
>>>>>> When there is a failure in f2fs_fill_super() after/during
>>>>>> the recovery of fsync'd nodes, it frees the current sbi and
>>>>>> retries again. This time the mount is successful, but the files
>>>>>> that got recovered before retry, still holds the extent tree,
>>>>>> whose extent nodes list is corrupted since sbi and sbi->extent_list
>>>>>> is freed up. The list_del corruption issue is observed when the
>>>>>> file system is getting unmounted and when those recoverd files extent
>>>>>> node is being freed up in the below context.
>>>>>>
>>>>>> list_del corruption. prev->next should be fffffff1e1ef5480, but was (null)
>>>>>> <...>
>>>>>> kernel BUG at kernel/msm-4.14/lib/list_debug.c:53!
>>>>>> task: fffffff1f46f2280 task.stack: ffffff8008068000
>>>>>> lr : __list_del_entry_valid+0x94/0xb4
>>>>>> pc : __list_del_entry_valid+0x94/0xb4
>>>>>> <...>
>>>>>> Call trace:
>>>>>> __list_del_entry_valid+0x94/0xb4
>>>>>> __release_extent_node+0xb0/0x114
>>>>>> __free_extent_tree+0x58/0x7c
>>>>>> f2fs_shrink_extent_tree+0xdc/0x3b0
>>>>>> f2fs_leave_shrinker+0x28/0x7c
>>>>>> f2fs_put_super+0xfc/0x1e0
>>>>>> generic_shutdown_super+0x70/0xf4
>>>>>> kill_block_super+0x2c/0x5c
>>>>>> kill_f2fs_super+0x44/0x50
>>>>>> deactivate_locked_super+0x60/0x8c
>>>>>> deactivate_super+0x68/0x74
>>>>>> cleanup_mnt+0x40/0x78
>>>>>> __cleanup_mnt+0x1c/0x28
>>>>>> task_work_run+0x48/0xd0
>>>>>> do_notify_resume+0x678/0xe98
>>>>>> work_pending+0x8/0x14
>>>>>>
>>>>>> Fix this by cleaning up the extent tree of those recovered files
>>>>>> before freeing up sbi and before next retry.
>>>>>
>>>>> Would it be more clear to call shrink_dcache_sb earlier to invalid all
>>>>> inodes and call f2fs_shrink_extent_tree release cached entries and trees in
>>>>> error path?
>>>>
>>>> Agreed.
>>>>
>>> I have tried doing shrink_dcache_sb() earlier but that doesn't call
>>> f2fs_shrink_extent_tree(). So I have moved f2fs_join_shrinker() earlier and 
>>> tried calling f2fs_leave_shrinker() in the error path. That helps to clean up
>>> the cached extent nodes. However, I see that extent tree is left intact for
>>
>> I didn't get it, you mean, in error path, after we call shrink_dcache_sb &
>> f2fs_leave_shrinker, for those recovered files, their extent nodes were
>> evicted, but their extent trees are still in cache?
>>
> 
> Yes, only extent tree is present with zero extent nodes as
> f2fs_leave_shrinker() is only clearing the exntent nodes from
> sbi->extent_list.

Oh, recovered inodes are in cache due to we didn't call evict_inodes, so
they are still referenced to extent tree...

How about calling evict_inodes after shrink_dcache_sb?

Thanks,

> 
>>> those recovered files, which should not be a problem as it gets freed as part
>>> of next umount/rm. Only one small problem I see with this is - during rm/umount when
>>> those previoulsy recovered files are being evicted, extent tree memory gets
>>> free'd but the counter sbi->total_ext_tree gets invalid as these recovered
>>> files are not present as part of current sbi->extent_tree_root. So i have come
>>> up with this patch below to fix this. Let me know if this looks good?
>>>
>>> diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c
>>> index 1cb0fcc..3e4801e 100644
>>> --- a/fs/f2fs/extent_cache.c
>>> +++ b/fs/f2fs/extent_cache.c
>>> @@ -654,9 +654,9 @@ unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink)
>>>  		}
>>>  		f2fs_bug_on(sbi, atomic_read(&et->node_cnt));
>>>  		list_del_init(&et->list);
>>> -		radix_tree_delete(&sbi->extent_tree_root, et->ino);
>>> +		if (radix_tree_delete(&sbi->extent_tree_root, et->ino))
>>> +			atomic_dec(&sbi->total_ext_tree);
>>>  		kmem_cache_free(extent_tree_slab, et);
>>> -		atomic_dec(&sbi->total_ext_tree);
>>>  		atomic_dec(&sbi->total_zombie_tree);
>>>  		tree_cnt++;
>>>  
>>> @@ -767,7 +767,8 @@ void f2fs_destroy_extent_tree(struct inode *inode)
>>>  	/* delete extent tree entry in radix tree */
>>>  	mutex_lock(&sbi->extent_tree_lock);
>>>  	f2fs_bug_on(sbi, atomic_read(&et->node_cnt));
>>> -	radix_tree_delete(&sbi->extent_tree_root, inode->i_ino);
>>> +	if (radix_tree_delete(&sbi->extent_tree_root, inode->i_ino))
>>> +		atomic_dec(&sbi->total_ext_tree);
>>>  	kmem_cache_free(extent_tree_slab, et);
>>>  	atomic_dec(&sbi->total_ext_tree);
>>>  	mutex_unlock(&sbi->extent_tree_lock);
>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>> index af58b2c..3e5588f 100644
>>> --- a/fs/f2fs/super.c
>>> +++ b/fs/f2fs/super.c
>>> @@ -3295,6 +3295,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>  	if (err)
>>>  		goto free_root_inode;
>>>  
>>> +	f2fs_join_shrinker(sbi);
>>>  #ifdef CONFIG_QUOTA
>>>  	/* Enable quota usage during mount */
>>>  	if (f2fs_sb_has_quota_ino(sb) && !f2fs_readonly(sb)) {
>>> @@ -3379,8 +3380,6 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>  			sbi->valid_super_block ? 1 : 2, err);
>>>  	}
>>>  
>>> -	f2fs_join_shrinker(sbi);
>>> -
>>>  	f2fs_tuning_parameters(sbi);
>>>  
>>>  	f2fs_msg(sbi->sb, KERN_NOTICE, "Mounted with checkpoint version = %llx",
>>> @@ -3402,6 +3401,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>  	 * falls into an infinite loop in f2fs_sync_meta_pages().
>>>  	 */
>>>  	truncate_inode_pages_final(META_MAPPING(sbi));
>>> +	shrink_dcache_sb(sb);
>>> +	f2fs_leave_shrinker(sbi);
>>
>> Why not just calling f2fs_shrink_extent_tree(sbi, __count_extent_cache(sbi)); ?
>>
> 
> Sure, will update in the next patchset.
> 
>> Thanks,
>>
>>>  	f2fs_unregister_sysfs(sbi);
>>>  free_root_inode:
>>>  	dput(sb->s_root);
>>> @@ -3445,7 +3446,6 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>  	/* give only one another chance */
>>>  	if (retry) {
>>>  		retry = false;
>>> -		shrink_dcache_sb(sb);
>>>  		goto try_onemore;
>>>  	}
>>>  	return err;
>>>
>>
> 

Powered by blists - more mailing lists