lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 9 Sep 2019 15:54:00 +0800
From:   Chao Yu <yuchao0@...wei.com>
To:     Jaegeuk Kim <jaegeuk@...nel.org>
CC:     <linux-kernel@...r.kernel.org>,
        <linux-f2fs-devel@...ts.sourceforge.net>
Subject: Re: [f2fs-dev] [PATCH 2/2] f2fs: avoid infinite GC loop due to stale
 atomic files

On 2019/9/9 15:30, Jaegeuk Kim wrote:
> On 09/09, Chao Yu wrote:
>> On 2019/9/9 9:25, Jaegeuk Kim wrote:
>>> If committing atomic pages is failed when doing f2fs_do_sync_file(), we can
>>> get commited pages but atomic_file being still set like:
>>>
>>> - inmem:    0, atomic IO:    4 (Max.   10), volatile IO:    0 (Max.    0)
>>>
>>> If GC selects this block, we can get an infinite loop like this:
>>>
>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>> f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
>>> f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
>>> f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
>>> f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
>>>
>>> In that moment, we can observe:
>>>
>>> [Before]
>>> Try to move 5084219 blocks (BG: 384508)
>>>   - data blocks : 4962373 (274483)
>>>   - node blocks : 121846 (110025)
>>> Skipped : atomic write 4534686 (10)
>>>
>>> [After]
>>> Try to move 5088973 blocks (BG: 384508)
>>>   - data blocks : 4967127 (274483)
>>>   - node blocks : 121846 (110025)
>>> Skipped : atomic write 4539440 (10)
>>>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@...nel.org>
>>> ---
>>>  fs/f2fs/file.c | 10 +++++-----
>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>> index 7ae2f3bd8c2f..68b6da734e5f 100644
>>> --- a/fs/f2fs/file.c
>>> +++ b/fs/f2fs/file.c
>>> @@ -1997,11 +1997,11 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
>>>  			goto err_out;
>>>  
>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
>>> -		if (!ret) {
>>> -			clear_inode_flag(inode, FI_ATOMIC_FILE);
>>> -			F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>> -			stat_dec_atomic_write(inode);
>>> -		}
>>> +
>>> +		/* doesn't need to check error */
>>> +		clear_inode_flag(inode, FI_ATOMIC_FILE);
>>> +		F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
>>> +		stat_dec_atomic_write(inode);
>>
>> If there are still valid atomic write pages linked in .inmem_pages, it may cause
>> memory leak when we just clear FI_ATOMIC_FILE flag.
> 
> f2fs_commit_inmem_pages() should have flushed them.

Oh, we failed to flush its nodes.

However we won't clear such info if we failed to flush inmen pages, it looks
inconsistent.

Any interface needed to drop inmem pages or clear ATOMIC_FILE flag in that two
error path? I'm not very clear how sqlite handle such error.

Thanks,

> 
>>
>> So my question is why below logic didn't handle such condition well?
>>
>> f2fs_gc()
>>
>> 	if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
>> 		if (skipped_round <= MAX_SKIP_GC_COUNT ||
>> 					skipped_round * 2 < round) {
>> 			segno = NULL_SEGNO;
>> 			goto gc_more;
>> 		}
>>
>> 		if (first_skipped < last_skipped &&
>> 				(last_skipped - first_skipped) >
>> 						sbi->skipped_gc_rwsem) {
>> 			f2fs_drop_inmem_pages_all(sbi, true);
> 
> This is doing nothing, since f2fs_commit_inmem_pages() removed the inode
> from inmem list.
> 
>> 			segno = NULL_SEGNO;
>> 			goto gc_more;
>> 		}
>> 		if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
>> 			ret = f2fs_write_checkpoint(sbi, &cpc);
>> 	}
>>
>>>  	} else {
>>>  		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>>>  	}
>>>
> .
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ